VACCINAL POLYPEPTIDES

This is a continuation-in-part of pending 
United States patent application Serial Number 751,896; 
which is a continuation-in-part of United States patent 
application Serial Number 387,558; which is a 
continuation-in-part of United States patent application 
Serial Number 238,801, now abandoned; which is a 
continuation-in-part of United States patent application 
Serial Number 645,732, now abandoned. 

Field of the Invention

The present invention relates generally to a 
polypeptide useful in a composition for providing 
immunity against influenza A and influenza B in an 
animal.

Background of the Invention 

Influenza virus infection causes acute 
respiratory disease in man, horses, swine and fowl, 
sometimes of pandemic proportions. Influenza viruses are 
orthomyxoviruses and, as such, have enveloped virions of
80 to 120 nanometers in diameter, with two different
glycoprotein spikes. Three types, A, B and C, infect
humans. Type A viruses have been responsible for the
majority of human epidemics in modern history, although
there are also sporadic outbreaks of Type B infections.
Known swine, equine and avian viruses have mostly been
Type A, although Type C viruses have also been isolated
from swine.

The Type A viruses are divided into subtypes
based on the antigenic properties of the hemagglutinin
(HA) and neuraminidase (NA) surface glycoproteins.
Within Type A, subtypes H1 ("swine flu"), H2 ("Asian
flu") and H3 ("Hong Kong flu") are predominant in human
infections. In swine, the predominant influenza A
subtypes are H1 and H3; in horses, H3 and H7; and in
avians, H5 and H7. Presently only one Type B virus has
been identified, with no subtypes.

Genetic "drift" or "shift", i.e., rapid and
unpredictable change in the antigen, occurs at
approximately yearly intervals, and affects antigenic
determinants in the HA and NA proteins. Therefore, it
has not been possible to prepare a "universal" influenza
virus vaccine, that is, a vaccine which is non-strain
specific, using conventional killed or attenuated
viruses. Recently, attempts have been made to prepare such
universal, or semi-universal, vaccines from reassortant
viruses prepared by crossing different strains. More
recently, such attempts have involved recombinant DNA
techniques focusing primarily on the HA protein.

There remains a need in the art for vaccine 
formulations and compositions capable of inducing 
protective responses in animals against influenza 
viruses.

Summary of the Invention

The present invention provides compositions 
containing, and methods for use of, a protein which is 
capable of inducing protection in animals and avians 
against challenge with more than one strain of influenza 
type A and influenza type B. 

Thus, one aspect of the invention provides a 
DNA sequence encoding a modified purified recombinant 
protein. The DNA sequence of the invention encodes a 
modified protein sequence derived from the HA2 subunit of 
a selected hemagglutinin (HA) protein. In one 
embodiment, the sequence is derived from an H3N2 subtype 
influenza virus. These H3N2 fusion proteins are capable 
of inducing T cell responses in the absence of
neutralizing antibodies. In another embodiment, a DNA
sequence of this invention encodes a modified protein
sequence derived from the HA2 subunit from a type B
influenza virus. Still further embodiments include DNA
sequences obtained as described for the two viruses above,
where the sequences are derived from other Type A
influenza strains infecting animals as well as humans.
Such viruses include, without limitation, Type A subtypes
H1, H2, H3, H4, H5, H6 and H7.

In another aspect, the invention provides a DNA
sequence encoding a recombinant fusion protein, in which
the desired Type A subtype HA2 subunit sequence, or a
portion thereof, is fused in frame to another protein or
protein fragment capable of enhancing expression of the
fusion protein. One embodiment includes the H3N2 subtype
HA2 subunit sequence described above fused in frame to
another protein or fragment capable of enhancing
expression thereof. Another embodiment of such a fusion
protein comprises a type B HA2 sequence, described above,
or a portion thereof, fused in frame to another protein
or protein fragment capable of enhancing expression of
the fusion protein. Still other Type A subtype HA2
sequences can be similarly used. It is desirable that
this fusion partner protein be an influenza protein
sequence or fragment thereof.

In still another aspect, a protein encoded by a
DNA sequence of the invention is provided. The protein
may be a protein sequence derived from the HA2 subunit of
a hemagglutinin (HA) protein from a selected Type A
subtype virus. Desirably the subtype virus is an H3N2 virus.
In another embodiment, the protein may be derived from
the HA2 subunit from a type B influenza virus. Other
embodiments include H5 or H7 subtypes. Additionally,
preferred embodiments include fusion proteins comprising
a protein sequence derived from the HA2 subunit of an HA
protein from a Type A virus, e.g., an H3N2 subtype, or
from a type B virus, fused in frame to a selected
influenza sequence. The proteins of this invention are
particularly useful in inducing protection in mammals,
especially humans, against challenge by type B or an H3N2
subtype of influenza A. The proteins employing other
Type A subtypes, e.g., H5 and H7, are useful in inducing
protection in animals against influenza viruses.

In a further aspect the invention provides a 
vaccine composition containing a purified protein of the 
invention, as described above. Such a vaccine 
composition may include a fusion protein of the 
invention. In other embodiments of the invention, the 
vaccine compositions contain an H3HA2 protein of the 
invention and other influenza antigens; a type B HA2
protein of the invention and other influenza antigens; or
an H3HA2 protein, a BHA2 protein and other influenza
antigens. In a preferred embodiment for human use, a
combination vaccine of the invention will contain an
H3HA2 and a BHA2 protein of the invention in combination
with influenza antigens derived from the other type A
influenza virus subtypes, H1 and H2. An embodiment for
use in animals may contain an H5HA2 or H7HA2 protein,
among others.

A further aspect of this invention is a method
for inducing in an animal protection against influenza
type A, influenza type B, influenza type C, or
combinations thereof, which comprises internally
administering to the animal an effective immunogenic
amount of a vaccine composition of the present invention.

Still a further aspect of this invention is a 
method for inducing in an animal protection against 
multiple strains of influenza types A and B which 
comprises internally administering to the animal an
effective immunogenic amount of a vaccine composition of
the present invention. 

Other aspects and advantages of the present 
invention are described further in the following detailed 
description of the preferred embodiments thereof.

Brief Description of the Drawings

Fig. 1 illustrates the nucleic acid sequences
of the HA2 portions of (a) A/Udorn [SEQ ID NO: 1], (b)
A/Victoria [SEQ ID NO: 3], (c) A/PR/8/34 [SEQ ID NO: 5],
and (d) a consensus sequence [SEQ ID NO: 7]. Dashes
indicate the same nucleotide as the consensus sequence.
Nucleotides that differ from the consensus sequence
are reported in lower case letters. Dots indicate no
corresponding nucleotide when compared to the consensus
sequence.

Fig. 2 illustrates the nucleic acid and amino
acid sequences of the NS1(1-81)H3HA2(1-221) fusion protein
[SEQ ID NO: 9 & 10].

Fig. 3 illustrates the nucleic acid and amino
acid sequences of the NS1(1-81)H3HA2(77-221) fusion protein
[SEQ ID NO: 11 & 12].

Fig. 4 illustrates the nucleic acid and amino
acid sequences of the type B fusion protein,
NS1(1-42)BHA2(41-223) [SEQ ID NO: 13 & 14].

Detailed Description of the Invention

The present invention provides novel proteins, 
DNA sequences, pharmaceutical vaccine compositions and 
methods of use thereof for conferring protection in 
vaccinated mammals against one strain, or desirably
multiple strains, of influenza viruses. The proteins and
vaccine compositions of the present invention demonstrate 
the ability to stimulate or produce a protective immune 
response which is capable of recognizing an influenza 
virus or influenza virus-infected cells and protecting 
the vaccinated mammal against disease caused thereby. 
This protective response is desirably a T cell response, 
produced in the substantial absence of vaccine-induced
neutralizing antibody. 

While the proteins and DNA sequences 
specifically described herein are directed to the H3HA2 
and BHA2 sequences originating from viral strains to 
which humans are susceptible, it is expected that similar 
sequences and molecules can be prepared for veterinary 
applications. For example, selected HA2 sequences
obtained from type A viral strains, e.g., H5HA2, H7HA2
and other strains of interest may be obtained following 
the teachings described herein for the exemplified H3HA2 
and BHA2 sequences. One of skill in the art should 
understand that this invention is not limited to the 
exemplified protein and DNA sequences, even though the 
following disclosure is limited to the two latter 
sequences for simplicity. Such additional viral HA2 
subunits are expected to share the biological
characteristics of the exemplified sequences. 

Thus, this invention provides a protein or
fragment thereof characterized by an amino acid sequence
derived from the HA2 subunit of a hemagglutinin (HA)
protein, e.g., from an H3N2 subtype virus. The H3
proteins of the invention are capable of inducing T
helper cells, particularly cytotoxic T lymphocytes, in
the absence of neutralizing antibodies. H3N2
subtype strains of influenza A include the A/Udorn and
A/Victoria viruses. Other H3N2 virus strains of
influenza A may also produce HA proteins for use in
vaccine compositions according to this invention. Fig. 1
compares the nucleic acid sequences of the HA2 portions
of the A/Udorn [SEQ ID NO: 1] and A/Victoria [SEQ ID NO:
3] strains with the nucleic acid sequence of an H1N1
subtype virus, A/PR/8/34 [SEQ ID NO: 5]. A consensus
sequence [SEQ ID NO: 7] was computer generated, and may
likewise be useful in producing proteins according to
this invention. This consensus sequence [SEQ ID NO: 7]
can be constructed by a commercially available
computerized sequence analysis program, such as that of
the Genetics Computer Group [University of Wisconsin].
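The consensus construction and the display convention described for
Fig. 1 can be illustrated with a short sketch. It assumes pre-aligned
sequences of equal length, a simple majority rule per column, and "."
as the gap symbol; the function names and the toy sequences are
illustrative only and are not taken from the sequence listing or from
the commercial program mentioned above.

```python
from collections import Counter

GAP = "."  # assumed symbol for "no corresponding nucleotide"


def consensus(aligned):
    """Return a majority-rule consensus of equal-length aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned):
        bases = [b.upper() for b in column if b != GAP]
        # Ties fall to the base seen first in the column (arbitrary choice).
        result.append(Counter(bases).most_common(1)[0][0] if bases else GAP)
    return "".join(result)


def render_against(seq, cons):
    """Render one aligned sequence relative to the consensus."""
    out = []
    for base, ref in zip(seq, cons):
        if base == GAP:
            out.append(".")           # no corresponding nucleotide
        elif base.upper() == ref:
            out.append("-")           # same nucleotide as the consensus
        else:
            out.append(base.lower())  # differing nucleotide in lower case
    return "".join(out)


# Toy alignment, for illustration only.
toy = ["ATGGCA", "ATGGCA", "ATCG.A"]
cons = consensus(toy)
print(cons)                          # ATGGCA
print(render_against(toy[2], cons))  # --c-.-
```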

Proteins according to this invention may 
include unfused HA2 subunits of the influenza A viruses, 
particularly H3N2 subtype. For example, in one 
embodiment, a protein of the invention contains amino
acids 1-221 of a selected H3HA2 subunit. In another
embodiment, a protein of the invention contains amino 
acids 77-221 of the H3HA2 subunit. Other fragments of 
this HA2 amino acid sequence characterized by the ability 
to stimulate similar immunological activity in an 
immunized animal are also encompassed by this invention. 

Proteins of this invention also include fusion 
proteins comprising a protein sequence derived from the 
HA2 subunit of an HA protein from a Type A virus, e.g., 
an H3N2 subtype virus, fused in frame to another protein 
or protein fragment capable of enhancing expression of 
the fusion protein. It is desirable that this fusion 
"partner" protein be an influenza protein sequence or 
fragment thereof derived from the same or another strain 
of influenza virus as the HA protein or protein fragment. 
Preferably, this fusion partner protein is all or a 
portion of the influenza virus NS1 gene or an HA2 
subunit. 

In the embodiments exemplified herein, the NS1 
portion of the fusion protein is derived from an H1N1
subtype virus, A/PR/8/34. For example, in one 
embodiment, the NS1 portion may comprise amino acid 
residues 1 to 42 of H1NS1. In another embodiment the NS1 
portion may comprise amino acid residues 1 to 81 of the 
selected virus. The HA2 fragment may alternatively be 
fused to a portion of the NS1 peptide derived from a
selected Type A virus, e.g., an H3 subtype virus (H3HA2),
or a type B (BHA2) virus. 

However, other non-influenza fusion partners
may also produce desirable fusion proteins with the H3N2, 
or other Type A, or type B protein or portion thereof. 
Thus, in still another alternative embodiment, as 
discussed below, the HA2 fragment may be fused to any 
peptide capable of enhancing its expression in the host 
cell selected. One of skill in the art may readily 
select a fusion "partner" protein or fragment taking into 
account the desired host cell and utilizing the teachings 
herein. The fusion proteins of the present invention are 
not limited by the selection of the "partner" protein or 
fragment to which the HA2 fragment is fused. 

In yet another embodiment, the present 
invention provides a modified protein containing a 
portion of the HA2 subunit of a type B influenza virus. 
Currently, the preferred human virus strain is B/Lee/40. 
However, the vaccinal proteins of this invention are not 
limited to this type B strain, and other strains
infecting other species, or other as yet unidentified 
type B virus strains, may be used to produce the HA2 
protein. These type B HA2 proteins may be fused, as 
described above for the H3HA2 proteins of this invention, 
or remain unfused.

In the construction of a fusion protein
according to this invention, a linker sequence may
optionally be inserted between the two fused sequences,
i.e., between the NS1 portion and the HA2 portion. This
optional linker may provide space between the two linked
sequences. Alternatively, this linker sequence may
encode, if desired, a polypeptide which is selectively
cleavable or digestible by conventional chemical or
enzymatic methods. For example, the selected cleavage
site may be an enzymatic cleavage site, including sites
for cleavage by a proteolytic enzyme, such as
enterokinase, factor Xa, trypsin, collagenase and
thrombin. Alternatively, the cleavage site in the linker
may be a site capable of being cleaved upon exposure to a
selected chemical, e.g., cyanogen bromide or
hydroxylamine. The cleavage site, if inserted into a
linker useful in the fusion sequences of this invention,
does not limit this invention. Any desired cleavage
site, of which many are known in the art, may be used for
this purpose.

A presently preferred example of a fusion
protein of this invention is NS1(1-81)H3HA2(1-221) [SEQ ID NO:
10], which comprises the first 81 amino acids of NS1
fused to amino acids 1 to 221 of the H3HA2 subunit.
Another exemplary fusion protein,
NS1(1-81)H3HA2(77-221) [SEQ ID NO: 12], comprises the first 81 amino
acids of NS1 fused to amino acids 77 to 221 of the
truncated H3HA2 subunit. Yet another preferred example
of a fusion protein of this invention is NS1(1-42)BHA2(41-223) [SEQ
ID NO: 14], which comprises the first 42 amino acids of
NS1 fused to amino acids 41 to 223 of the truncated BHA2
subunit. These proteins, fusion proteins and similar
proteins encoded by the below-described DNA sequences are
referred to collectively herein as H3HA2 proteins.
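The residue ranges in these designations are 1-based and inclusive.
The sketch below shows how such a fusion could be assembled from two
parent sequences and how the expected lengths follow from the stated
ranges. The parent sequences are placeholder strings of an assumed
length, not the sequences of the sequence listing, and the helper
names are hypothetical.

```python
def residues(seq, start, end):
    """Return residues start..end of seq (1-based, inclusive)."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError(f"range {start}-{end} outside length {len(seq)}")
    return seq[start - 1:end]


def fuse(partner, partner_range, ha2, ha2_range, linker=""):
    """Join a fusion-partner fragment, an optional linker and an HA2 fragment."""
    return residues(partner, *partner_range) + linker + residues(ha2, *ha2_range)


# Placeholder sequences only (assumed lengths, not real NS1/HA2 data).
ns1 = "N" * 230
h3ha2 = "H" * 221
bha2 = "B" * 223

full_h3 = fuse(ns1, (1, 81), h3ha2, (1, 221))    # NS1(1-81)H3HA2(1-221)
short_h3 = fuse(ns1, (1, 81), h3ha2, (77, 221))  # NS1(1-81)H3HA2(77-221)
type_b = fuse(ns1, (1, 42), bha2, (41, 223))     # NS1(1-42)BHA2(41-223)

print(len(full_h3), len(short_h3), len(type_b))  # 302 226 225
```

Any linker residues between the two portions would add to these
lengths; the optional `linker` argument is included for that case.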

The NS1(1-81)H3HA2(1-221) protein [SEQ ID NO: 10] of
the invention has a three-dimensional structure which is
substantially similar to that of the NS1(1-81)HA2(1-222) protein
[SEQ ID NO: 16] derived from the H1N1 subtype virus
(C13). However, the amino acid sequence of the
NS1(1-81)H3HA2(1-221) protein [SEQ ID NO: 10] has only approximately
50% homology with the amino acid sequence of the C13 protein
[SEQ ID NO: 16]. Additionally, as illustrated in Fig. 1,
the nucleic acid sequence of the H3HA2(1-221) fragment derived
from A/Udorn (nucleotides 25-560 from that virus) [SEQ ID
NO: 1] has only approximately 60% homology with the
nucleic acid sequence of the H1HA2(1-222) protein derived from
strain A/PR/8/34 (nucleotides 1872-2407 from A/PR/8/34)
[SEQ ID NO: 5]. However, the nucleic acid sequence of
H3HA2(1-221) from A/Udorn (nucleotides 1-499 of A/Udorn) [SEQ
ID NO: 1] has approximately 99% homology with the nucleic
acid sequence of H3HA2(1-221) from A/Victoria/H3/75
(nucleotides 1226-1725 of A/Victoria) [SEQ ID NO: 3]
[Fiers et al, Cell, 19:683-696 (1980)].
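The text does not state how the homology percentages were computed.
One common reading is percent identity over an alignment, which the
following sketch implements under that assumption: aligned positions
that contain a gap (".") in either sequence are skipped, and the
remaining positions are compared case-insensitively. The function
name and the toy inputs are illustrative only.

```python
def percent_identity(a, b, gap="."):
    """Percent of identical positions between two aligned sequences.

    Columns with a gap in either sequence are ignored (an assumption).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue
        compared += 1
        if x.upper() == y.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy sequences, not taken from the sequence listing.
print(round(percent_identity("ATGGCA", "ATCGCA"), 1))  # 83.3
```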

Analogs of the HA2 peptides from a Type A
virus, e.g., an H3 virus, or from B viruses, included within the
definition of this invention, include truncated
polypeptides (including fragments) and HA2 polypeptides,
e.g., mutants, that retain the epitopes and thus the
biological activity of HA2. It is anticipated that,
because the NS1 portion of the fusion peptide provides a
means of expressing the protein at high levels and does
not appear to play as significant a role in the
immunological responses to the HA2 fusion proteins as
does the HA2 portion, any number of analogs of this
fusion partner can be made.

Typically, the analogs of the HA2 peptides
and/or the fusion partner differ by only 1 to about 4
codon changes. Other examples of analogs include
polypeptides with minor amino acid variations from the
natural amino acid sequence of HA2; in particular,
conservative amino acid replacements. Conservative
replacements are those that take place within a family of
amino acids that are related in their side chains.
Genetically encoded amino acids are generally divided
into four families: (1) acidic = aspartate, glutamate;
(2) basic = lysine, arginine, histidine; (3) non-polar =
alanine, valine, leucine, isoleucine, proline,
phenylalanine, methionine, tryptophan; and (4) uncharged
polar = glycine, asparagine, glutamine, cysteine, serine,
threonine, tyrosine. Phenylalanine, tryptophan, and
tyrosine are sometimes classified jointly as aromatic
amino acids. For example, it is reasonable to expect
that an isolated replacement of a leucine with an
isoleucine or valine, an aspartate with a glutamate, a
threonine with a serine, or a similar conservative
replacement of an amino acid with a structurally related
amino acid will not have a significant effect on
activity, especially if the replacement does not involve
an amino acid at an epitope of the HA2 polypeptide.
The construction of such analogs, given the description
herein and conventional methods of protein modification
known to one of skill in the art, is believed to be
encompassed by this invention.
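The four-family grouping above lends itself to a simple lookup. The
sketch below encodes the families with one-letter amino acid codes
and reports whether a single replacement stays within one family.
It only illustrates the classification given in the text; it does not
predict whether a particular replacement preserves activity, and the
names used are hypothetical.

```python
# Families exactly as listed in the text, in one-letter codes.
FAMILIES = {
    "acidic": set("DE"),
    "basic": set("KRH"),
    "non-polar": set("AVLIPFMW"),
    "uncharged polar": set("GNQCSTY"),
}


def family_of(residue):
    """Return the family name for a one-letter amino acid code."""
    for name, members in FAMILIES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown amino acid: {residue!r}")


def is_conservative(original, replacement):
    """True if both residues belong to the same side-chain family."""
    return family_of(original) == family_of(replacement)


# The examples named in the text, plus one non-conservative contrast.
print(is_conservative("L", "I"))  # True  (leucine -> isoleucine)
print(is_conservative("D", "E"))  # True  (aspartate -> glutamate)
print(is_conservative("T", "S"))  # True  (threonine -> serine)
print(is_conservative("D", "K"))  # False (acidic -> basic)
```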

Currently, it is theorized that the HA2 portion
of the fusion peptide (e.g., H3HA2(1-221), H3HA2(77-221) and
BHA2(41-223)) confers the majority of the necessary epitopes
for antibody binding or T cell (particularly CTL)
targeting. Once these epitope sequences are precisely
identified, portions of the HA2 sequence which are not
part of these epitopes may be altered without
significantly affecting the bioactivity of the fusion
protein.

The present invention also encompasses DNA
sequences encoding the above-described
proteins and fusion proteins, the sequences characterized
by having an immunogenic determinant of a modified HA2
subunit of an HA protein, derived from a Type A virus,
e.g., an H3 subtype, or type B virus. Other DNA
sequences of this invention encode such HA2 subunits,
optionally fused to a DNA sequence encoding a protein or
peptide which is capable of enhancing expression of the
protein in a selected host cell. For example, the
consensus sequence illustrated in Fig. 1(d) may provide a
source of HA2 DNA. The currently preferred embodiment
provides a DNA sequence encoding a Type A, e.g., an
H3, or type B HA2 protein or fragment thereof fused in
frame to a DNA sequence encoding a portion of the
nonstructural influenza protein 1 (NS1).

Coding sequences for the HA2, NS1 and other
viral proteins of influenza virus can be prepared
synthetically or can be derived from viral RNA or from
available cDNA-containing plasmids by known techniques.
For example, in addition to the above-cited references, a
DNA coding sequence for HA from the A/Japan/305/57 strain
was cloned, sequenced and reported by Gething et al,
Nature, 287:301-306 (1980). An HA coding sequence for
strain A/NT/60/68 was cloned as reported by Sleigh et al,
and by Both et al, in Developments in Cell Biology,
Elsevier Science Publishing Co., pages 69-79 and 81-89,
respectively, (1980). An HA coding sequence for strain
A/WSN/33 was cloned as reported by Davis et al, Gene,
10:205-218 (1980); and by Hiti et al, Virology, 111:113-
124 (1981). An HA coding sequence for fowl plague virus
was cloned as reported by Porter et al and by Emtage et
al, both in Developments in Cell Biology, cited above, at
pages 39-49 and 157-168. Also, influenza viruses,
including other strains, subtypes and types, are
available from clinical specimens and from public
depositories, such as the American Type Culture
Collection (ATCC), Rockville, Maryland, U.S.A.

Allelic variations (naturally-occurring base 
changes in the species population which may or may not 
result in an amino acid change) of DNA sequences encoding 
the H3HA2 or BHA2 protein sequences are also included in 
the present invention, as well as analogs or derivatives 
thereof. Similarly, DNA sequences which code for H3 or 
other Type A or type B HA2 proteins of the invention but 
which differ in codon sequence due to the degeneracies of 
the genetic code or variations in the DNA sequence 
encoding H3HA2, other Type A or BHA2 proteins which are 
caused by point mutations or by induced modifications to 
enhance the activity, half-life or production of the
peptide encoded thereby are also encompassed in the
invention. Also covered by this invention are DNA
sequences which hybridize under stringent conditions with
the DNA sequences encoding the HA2 subunit proteins,
e.g., H3HA2 or BHA2 proteins, of this invention. DNA
sequences which hybridize under non-stringent conditions
with the disclosed sequences, but which encode proteins
or fragments retaining the biological activities of the
H3HA2 or BHA2 proteins, are also included in this
invention. Typical conditions for stringent or
non-stringent hybridization are known to those of skill in
the art. [See, e.g., Sambrook et al, Molecular Cloning:
A Laboratory Manual, 2nd edition, Cold Spring Harbor
Laboratory, NY (1989)].
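The point about degeneracy of the genetic code can be made concrete
with a small translation sketch. It builds the standard genetic code
table and shows two DNA sequences that differ in codon choice yet
encode the same peptide. The two sequences are toy examples chosen
for illustration and are not part of the sequence listing.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base slowest, third base fastest); "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA coding sequence (length divisible by 3)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two toy coding sequences that differ at the DNA level.
seq_a = "ATGGGTCTG"
seq_b = "ATGGGCTTA"
print(translate(seq_a), translate(seq_b))  # MGL MGL
```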

The fusion proteins of the invention may be 
prepared by conventional genetic engineering and 
recombinant techniques known to those of skill in the 
art. Similarly, the proteins may be purified from 
expression in host cell or vector systems by conventional 
means. 

Systems for cloning and expression of the
vaccinal polypeptide of this invention in various
microorganisms and cells, including, for example, E.
coli, Bacillus, Streptomyces, Saccharomyces, mammalian
and insect cells, are known and available from private
and public laboratories and depositories and from
commercial vendors. The preferred host is E. coli
because it can be used to produce large amounts of
desired proteins safely and cheaply. The polypeptide
employed in the presently preferred embodiment is
expressed in E. coli. To circumvent the requirement of
ampicillin for plasmid selection in production
fermentations, a preferred method of production employs
an alternative expression system in which the β-lactamase
coding sequence is wholly or partially replaced by a
coding sequence for an alternative selectable marker such
as, for example, kanamycin or chloramphenicol resistance.

To aid in expression of the H3 or other Type A 
subunit or type B HA2 peptides or fusion protein 
described above, these protein sequences or fragments 
thereof may also be fused to a polypeptide capable of 
enhancing expression of these fragments in the selected 
host system. Ordinarily, such a peptide would contain a 
leader sequence fragment that provides for secretion of 
the Type A subunit fragment, e.g., the H3HA2 fragment, or 
type B HA2 fragment in the host cell. The leader 
sequence fragment typically encodes a signal peptide 
comprised of hydrophobic amino acids which direct the 
secretion of the protein from the cell. There may be
processing sites encoded between the leader sequence and
the Type A subtype or type B HA2 fragment that can be
cleaved either in vivo or in vitro. Alternatively, a
promoter sequence may be linked directly with the DNA
molecule encoding the HA2 fragment. Such polypeptides,
promoter and leader sequences are known to those of skill
in the art and may be readily selected for expression in 
the selected host. 

Construction of expression systems, including
expression vectors and transformed host cells, is thus
within the skill of the art. See, generally, methods described in
standard texts, such as Sambrook et al, Molecular Cloning:
A Laboratory Manual, 2d edit., Cold Spring Harbor
Laboratory, Cold Spring Harbor, NY (1989). The present
invention is therefore not limited to any particular 
expression system or vector, nor to any particular 
purification process from cell lysates or cell medium. 

The proteins and fusion proteins of this 
invention may be employed in vaccine compositions. 
Pharmaceutical vaccine compositions of this invention, 
therefore, contain an effective immunogenic amount of a 
selected HA2 protein, e.g., H3HA2 or BHA2 protein, of the 
invention in admixture with a suitable adjuvant in a 
nontoxic and sterile pharmaceutically acceptable carrier. 

Suitable carriers for vaccine use are well 
known to those of skill in the art. However, exemplary 
carriers include sterile saline, lactose, sucrose, 
calcium phosphate, gelatin, dextrin, agar, pectin, peanut
oil, olive oil, sesame oil, squalene and water.
Additionally, the carrier or diluent may include a time
delay material, such as glyceryl monostearate or glyceryl
distearate alone or with a wax. Optionally, suitable
chemical stabilizers may be used to improve the stability
of the pharmaceutical preparation. Suitable chemical
stabilizers are well known to those of skill in the art
and include, for example, citric acid and other agents to
adjust pH, chelating or sequestering agents, and
antioxidants.

While any aluminum adjuvant may be used in the 
vaccine compositions of this invention, two desirable 
adjuvants are commercially marketed under the trademarks 
Rehsorptar [Armour Pharmaceuticals, Kankakee, IL] and
Rehydragel [Reheis Chemical Co., Berkeley Heights, NJ].
These products are aluminum hydroxide gels which contain
approximately 2% w/v Al2O3, which is equivalent to
approximately 10.6 mg/ml Al+3.
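The stated equivalence between the oxide content and the aluminum
content can be checked with a short calculation. The sketch assumes
that 2% w/v means 20 mg of aluminum oxide (Al2O3) per ml, reads the
aluminum figure as mg per ml, and uses standard atomic masses; the
function name is illustrative only.

```python
AL_MASS = 26.98  # g/mol, aluminum
O_MASS = 16.00   # g/mol, oxygen


def aluminum_mg_per_ml(oxide_percent_w_v):
    """Convert an Al2O3 content in % w/v to mg of aluminum per ml."""
    oxide_mg_per_ml = oxide_percent_w_v * 10.0  # 1% w/v = 10 mg/ml
    al_fraction = 2 * AL_MASS / (2 * AL_MASS + 3 * O_MASS)
    return oxide_mg_per_ml * al_fraction


print(round(aluminum_mg_per_ml(2.0), 1))  # 10.6
```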

Vaccine compositions of this invention may
employ an immunogenic amount of a purified recombinant
protein as described above. A preferred embodiment of
the vaccine of the invention is composed of an aqueous
suspension or solution containing the recombinant HA2
protein molecule, e.g., H3HA2 or BHA2, together with an
adjuvant, preferably an aluminum adjuvant, most preferably
aluminum hydroxide, buffered at physiological pH, in a
form ready for injection. A preferred protein for use in
these vaccine compositions includes a protein comprising
amino acid residues 1 to 81 from NS1 fused to C-terminal
amino acid residues 1-221 from the hemagglutinin subunit
2 (HA2) from influenza A, subtype H3N2. Another
preferred vaccine composition of this invention employs a
purified recombinant protein made up of amino acid
residues 1 to 81 from NS1 fused to amino acid residues
77-221 of the HA2 from influenza A, subtype H3N2. Still
another preferred vaccine composition of this invention
employs a purified recombinant protein made up of amino
acid residues 1 to 42 from NS1 fused to amino acid
residues 41-223 of the HA2 from influenza B.

Vaccine compositions of the invention may also 
employ an immunogenic amount of a recombinant protein of 
the invention in combination with other influenza 
antigens. Suitable influenza antigens for combination in 
a vaccine composition with the proteins of this invention 
may be derived from type A, H1 subtype viruses and may
include the recombinant fusion proteins described in
detail in copending U.S. Patent Application Ser. No.
07/387,200, filed July 28, 1989 and its corresponding
European Patent Application No. 366,238, published May
2, 1990; and in co-pending U.S. Patent Application Ser.
No. 07/387,558, filed July 28, 1989 and its corresponding
European Patent Application No. 366,239, published May 2,
1990. The C13 protein (NS1(1-81)HA2(1-222)) [SEQ ID NO: 15 &
16], D protein (NS1(1-81)HA2(65-222)) [SEQ ID NO: 17 & 18] and
other fusion proteins derived from the H1N1 influenza
virus subtype and the recombinant expression and
purification thereof are disclosed in detail in these 
applications, and in the parent applications identified 
in this application, all of which are incorporated by 
reference herein. 

More specifically, suitable H1 subtype
immunogenic proteins include C13 (NS1(1-81)-D-L-S-R-HA2(1-222))
[SEQ ID NO: 15 & 16], D (NS1(1-81)-Q-I-P-HA2(65-222)) [SEQ ID NO:
17 & 18], C13 short (NS1(1-42)-M-D-L-S-R-HA2(1-222)) [SEQ ID NO:
19 & 20), D Short (NS1 (1 ^,-M-D-H-M-L-T-S-T-R-S-HA2 ,84.222,) 
[SEQ ID NO: 21 & 22], A (NSl Q41) -Q-I-P-HA2 (6Wa) ) [SEQ ID NO: 
23 & 24), C (NSl^-Q-I-P-HAa,,,^,) [SEQ ID NO: 25 & 26), 
4° (NSl^IU^Ma,) [SEQ ID NO: 27] , A13 (NSl 04 , r D-L-S-R- 
HAVtoTS-C-L-T-A-Y-H-R) [SEQ ID NO: 28), M (NSl^-Q-I-P- 
HA2 (6M96) -G-G-S-Y-S-M-E-H-F-R-W-G-K-P-V) [SEQ ID NO: 29], AM 
(NS1 (MI) -Q-I-P-HA2 (4W96) -G-G-S-Y-S-M-L-V-N) [SEQ ID NO: 30] , 
4M+ (NSl Mir Q-I-P-HA2 (4J . M0) -L-V-L-L) [SEQ ID NO: 31 & 32). 
These H1N1 fusion proteins are described in published 
European Patent Application 366,238 and in copending U.S. 
Patent Application Ser. No. 07/751,896. Other suitable 
H1 proteins consist of unfused polypeptides, such as
H1HA2(1-222) [SEQ ID NO: 33 & 34], which is disclosed in
co-pending U.S. Patent Application Ser. No. 07/751,898,
incorporated herein by reference. Thus, one desirable 
combination vaccine to provide protection against Type A
influenza contains NS1(1-81)H3HA2(1-221) protein [SEQ ID NO: 9 &
10] of the invention, one or more proteins derived from 
subtype H1N1 as described above, and an aluminum 
adjuvant. 

Preferably, a combination vaccine of the 
invention will contain an immunogenic amount of the H3 
fusion protein of the invention in combination with 
immunogenic amounts of influenza antigens derived from 
the other type A influenza virus subtypes, including 
among others, H1, H2, H3, H4, H5, H6 and H7 as well as a
type B fusion protein of the invention. Therefore, other
preferred combination vaccines would include the
NS1(1-81)H3HA2(77-221) protein [SEQ ID NO: 11 & 12] in combination
with one or more additional influenza antigens derived
from the type or subtype influenza viruses described
above. Thus, the combination vaccine will protect
against influenza infections caused by both type A and
type B influenza viruses. Still other combination
vaccine compositions will employ other proteins described 
herein. 

The compositions of the present invention are 
advantageously made up in a dose unit form adapted for 
the desired mode of administration. Each unit will 
contain, at a minimum, a predetermined quantity of the 
selected HA2 subunit protein, e.g., H3HA2 protein and/or
BHA2 protein, and adjuvant calculated to produce the
desired therapeutic effect in optional association with a 
pharmaceutical diluent, carrier, or vehicle. 

Dosage protocol can be optimized in accordance 
with standard vaccination practices. Typically, the 
vaccine will be administered intramuscularly, although 
other routes of administration may be used, such as 
intradermal. It is expected that an effective 
immunogenic amount of a protein, fusion protein or 
combination of proteins of this invention for average 
adult humans is in the range of 1 to 1000 micrograms.
Another desirable immunogenic amount ranges from 50 to
500 micrograms. Most preferably, the proteins of the
invention are in admixture with the same amount or more 
adjuvant to form a vaccine composition. 

While the proteins described herein have been 
particularly developed for use in humans (e.g., the H3HA2 
and BHA2 sequences) , it is expected that due to species 
cross-reactivity, these vaccines will be useful in other 
animals, particularly swine. Additionally, similar
molecules can be prepared for equine and avian veterinary 
applications utilizing the HA2 proteins from other 
strains to which animals are susceptible. Combination 
vaccines for use in swine would preferably include
protection against both H1 and H3 viruses. Combination
vaccines for use in equine species would preferably include
protection against H3 and H7 viruses; combination
vaccines for use in avian species would preferably confer
protection against H5 and H7 viruses. Appropriate
dosages can be determined by one skilled in veterinary 
medicine. 

It will be understood, however, that the 
specific effective immunogenic amount for any particular 
patient will depend upon a variety of factors including 
the age, general health, sex, and diet of the vaccinee; 
the species of the vaccinee; the time of administration; 
the route of administration; interactions with any other 
drugs being administered; and the degree of protection 
being sought. 

The vaccine can be administered initially in 
late summer or early fall and can be readministered two 
to six weeks later, if desirable, or periodically as 
immunity wanes, for example, every two to five years. 
Of course, as stated above, the administration can be 
repeated at suitable intervals if necessary or desirable. 

The following examples illustrate methods for 
preparing H3HA2 and BHA2 fusion proteins of the invention 
and demonstrate the subtype specific protection against 
heterologous virus induced upon vaccination with the 
H3HA2 proteins. These examples are illustrative only and 
do not limit the scope of the invention. 

EXAMPLE 1 - PLASMID pMS3H3HA

Plasmid pFV88 contains the entire 221 amino 
acid length HA from A/Udorn, an H3 subtype virus [C. J.
Lai et al, Proc. Natl. Acad. Sci. USA, 77:210-214
(1980)], which HA nucleic acid sequence is illustrated in
Fig. 1 [SEQ ID NO: 1]. This plasmid was cut with PstI.
The resulting 1900 bp fragment, which contains the entire
HA (HA1 and HA2) fragment and some GC tailing, was then
inserted into pUC18 [Bethesda Research Laboratories].
The resulting plasmid is termed pMS3 or pMS3H3HA. 

EXAMPLE 2 - pMG1

Plasmid pAPR801 is a pBR322-derived cloning
vector which carries the NS1 coding region (A/PR/8/34) . 
It is described by Young et al, in The Origin of Pandemic 
Influenza Viruses, ed. by W. G. Laver, Elsevier Science 
Publishing Co. (1983). 

Plasmid pAS1 is a pBR322-derived expression
vector which contains the PL promoter, an N utilization
site (to relieve transcriptional polarity effects in the
presence of N protein) and the cII ribosome binding site
including the cII translation initiation codon followed
immediately by a BamHI site. It is described by
Rosenberg et al, in Methods Enzymol., 101:123-138 (1983).

Plasmid pAS1ΔEH was prepared by deleting a
non-essential EcoRI-HindIII region of pBR322 origin from
pAS1. A 1236 base pair BamHI fragment of pAPR801,
containing the NS1 coding region in 861 base pairs of
viral origin and 375 base pairs of pBR322 origin, was
inserted into the BamHI site of pAS1ΔEH. The resulting
plasmid, pAS1ΔEH/801, expresses authentic NS1 (230 amino
acids). The plasmid has an NcoI site between the codons
for amino acids 81 and 82 and an NruI site 3' to the NS
sequences. The BamHI site between amino acids 1 and 2 is
retained.
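
The fragment sizes quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch only: the variable and function names are illustrative, and the assumption that the NS1 open reading frame ends with a single stop codon is ours, not a statement from the text.

```python
# Sizes quoted in the text (base pairs / residues).
VIRAL_BP = 861        # viral-origin portion of the BamHI fragment
PBR322_BP = 375       # pBR322-origin portion of the BamHI fragment
FRAGMENT_BP = 1236    # total BamHI fragment of pAPR801
NS1_RESIDUES = 230    # length of authentic NS1


def coding_bp(residues: int, include_stop: bool = True) -> int:
    """Base pairs needed to encode `residues` amino acids (assumes one stop codon)."""
    return 3 * (residues + (1 if include_stop else 0))


# The two stated portions should add up to the stated fragment length.
print("fragment adds up:", VIRAL_BP + PBR322_BP == FRAGMENT_BP)

# A 230-residue protein needs far fewer base pairs than the viral portion supplies,
# so the full NS1 coding region can fit within the 861 bp of viral origin.
orf_bp = coding_bp(NS1_RESIDUES)
print("NS1 ORF (bp):", orf_bp, "fits in viral portion:", orf_bp <= VIRAL_BP)

# Codons 1-81 (upstream of the NcoI site) account for 243 bp of coding sequence.
print("bp encoding residues 1-81:", coding_bp(81, include_stop=False))
```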

Plasmid pMG27N, a pAS1 derivative [Mol. Cell.
Biol., 5:1015-1024 (1985)], was cut with BamHI and SacI
and ligated to a BamHI/NcoI fragment encoding the first
81 amino acids of NS1 from pAS1ΔEH801 and a synthetic DNA
NcoI/SacI fragment of the following sequence:
SEQ ID NO: 35:

5'-CATGGATCATATGTTAACAGATATCAAGGCCTGACTGACTGAGAGCT-3'
SEQ ID NO: 36:

3'-    CTAGTATACAATTGTCTATAGTTCCGGACTGACTGACTC    -5'

The resulting plasmid, pMG1, allows the
insertion of DNA fragments after the first 81 amino acids 
of NS1 in any of the three reading frames within the 
synthetic linker fragment followed by termination codons 
in all three reading frames. 
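
The two properties claimed for the synthetic linker (the strands pair to leave NcoI- and SacI-compatible ends, and termination codons occur in all three reading frames) can be checked directly from SEQ ID NO: 35 and 36. This is a minimal sketch; the helper names are illustrative, and the frames are numbered relative to the first base of the top strand as printed, not relative to the NS1 reading frame.

```python
TOP = "CATGGATCATATGTTAACAGATATCAAGGCCTGACTGACTGAGAGCT"   # SEQ ID NO: 35, 5'->3'
BOTTOM_3_TO_5 = "CTAGTATACAATTGTCTATAGTTCCGGACTGACTGACTC"  # SEQ ID NO: 36, as printed 3'->5'

COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def complement(seq: str) -> str:
    """Base-by-base complement (no reversal, so the result reads 3'->5')."""
    return seq.translate(COMPLEMENT)


def stop_frames(seq: str) -> set:
    """Reading frames (0, 1, 2 from the first base) containing at least one stop codon."""
    return {i % 3 for i in range(len(seq) - 2) if seq[i:i + 3] in STOP_CODONS}


# The top strand carries a 4-nt 5' overhang (CATG, NcoI-compatible) and a
# 4-nt 3' overhang (AGCT, SacI-compatible); the middle must pair with SEQ ID NO: 36.
duplex_ok = complement(TOP[4:-4]) == BOTTOM_3_TO_5

print("5' overhang:", TOP[:4], "| 3' overhang:", TOP[-4:])
print("strands complementary:", duplex_ok)
print("frames with a stop codon:", sorted(stop_frames(TOP)))
```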

EXAMPLE 3 - pMG1H3HA

Plasmid pMG1, described above in Example 2, was
digested with NcoI and XbaI, releasing a 54 bp fragment,
which was discarded. pMS3H3HA, described in Example 1
above, was digested with HhaI and XbaI, and a 701 bp
fragment containing the coding sequence for the HA2
subunit of influenza strain A/Udorn (H3N2) was isolated,
as illustrated in Fig. 1 [SEQ ID NO: 1].

Synthetic oligonucleotides were annealed to
generate an NcoI 5' overhang sequence (at the 5' end) and
a HhaI 3' overhang sequence (at the 3' end). The
sequence of these oligonucleotides is as follows:
SEQ ID NO: 37: 5'-CATGGGCGCCCATATGGGCATATTCGGCG-3'

SEQ ID NO: 38: 3'-    CCGCGGGTATACCCGTATAAGCC  -5'

The annealing reaction was performed as follows. The
annealing mixture was made up of 2.5 µL each of the 5' oligo
(1.3 µg/µL) and the 3' oligo (1.2 µg/µL), and added water
(15 µL) to a final volume of 20 µL. The reaction tubes
were then placed in 4 mL culture tubes containing water
which had been heated to 65°C for 10 minutes and allowed
to cool down slowly. The tubes were then put on ice and
used immediately for ligation.
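
As a worked check of the annealing mixture described above, the sketch below totals the stated volumes, derives the mass and final concentration of each oligonucleotide, and confirms that SEQ ID NO: 37 and 38 pair to leave a 4-nt 5' overhang and a 2-nt 3' overhang. The dictionary layout and variable names are illustrative assumptions; only the volumes, stock concentrations and sequences come from the text.

```python
import math

# Volumes (µL) and stock concentrations (µg/µL) as stated in the text.
components_ul = {"5' oligo": 2.5, "3' oligo": 2.5, "water": 15.0}
stock_ug_per_ul = {"5' oligo": 1.3, "3' oligo": 1.2}

total_ul = sum(components_ul.values())
mass_ug = {name: components_ul[name] * conc for name, conc in stock_ug_per_ul.items()}
final_ug_per_ul = {name: mass / total_ul for name, mass in mass_ug.items()}

print("total volume (µL):", total_ul)
for name in stock_ug_per_ul:
    print(f"{name}: {mass_ug[name]:.2f} µg, {final_ug_per_ul[name]:.4f} µg/µL final")

# Duplex check: the top strand overhangs by 4 nt at its 5' end (NcoI-compatible)
# and by 2 nt at its 3' end (HhaI-compatible).
TOP = "CATGGGCGCCCATATGGGCATATTCGGCG"      # SEQ ID NO: 37, 5'->3'
BOTTOM_3_TO_5 = "CCGCGGGTATACCCGTATAAGCC"  # SEQ ID NO: 38, as printed 3'->5'
paired = TOP[4:-2].translate(str.maketrans("ACGT", "TGCA")) == BOTTOM_3_TO_5
print("strands complementary:", paired)
```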

This three part ligation generates pMG1H3HA2(1-221)
[SEQ ID NO: 9], which codes for the first 81 amino acids
of NS1 fused to four amino acids donated from the linker
and amino acids 1-221 of the HA2 subunit. This sequence
is illustrated in Fig. 2 [SEQ ID NO: 9 & 10]. This
molecule is also designated NS1(1-81)H3HA2(1-221) [SEQ ID NO: 9 &
10].

EXAMPLE 4 - NS1(1-81)H3HA2(77-221) [SEQ ID NO: 11 & 12]

pMS3H3HA, described in Example 1 above, was
digested with EcoRI and end-filled (Klenow).
Subsequently, the vector was digested with XbaI. A 487
bp fragment, which contains the coding sequence for amino
acids 77-221 of the HA2 subunit, was isolated and ligated
to the HpaI and XbaI sites of pMG1. The resulting vector
codes for a fusion polypeptide containing amino acids 1-81
of NS1 fused to amino acids 77-221 of the HA2 subunit.
This molecule has been termed NS1(1-81)H3HA2(77-221) and is
illustrated in Fig. 3 [SEQ ID NO: 11 & 12].

EXAMPLE 5 - pMG42BLHA2

To derive a vector similar to pMG1 (described
in Example 2), which contains the coding region for the
first 42 amino acids of NS1 rather than the first 81
amino acids of NS1, pMG1 was digested with BamHI and NcoI
and ligated to the BamHI/NcoI fragment encoding amino
acids 2 to 42 of NS1 from pNS1(42)TGFα. pNS1(42)TGFα is
derived when pAS1ΔEH801 is cut with NcoI and SalI and
ligated to a synthetic DNA encoding human TGFα as an
NcoI/SalI fragment. pNS1(42)TGFα encodes a protein
comprised of the first 42 amino acids of NS1 and the
mature TGFα sequence. The NS1 portion of pNS1(42)TGFα
contains an amino acid change from Cys to Ser at amino
acid #13.

The resulting plasmid, termed pMG42A, was then
modified to contain an alternative synthetic linker after
the NS1(42) sequence with a different set of restriction
enzyme sites within which to insert foreign DNA fragments
into the three reading frames after the NS1(42). This
linker has the following sequence:
SEQ ID NO: 39:

5'-CATGGATCATATGTTAACAAGTACTCGATATCAATGAGTGACTGAAGCT-3'
SEQ ID NO: 40:

3'-    CTAGTATACAATTGTTCATGAGCTATAGTTACTCACTGACT    -5'

The resulting plasmid is called pMG42B. This vector is
needed to contain the neomycin phosphotransferase-1 (NPT-1)
gene, which confers kanamycin resistance.
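
To illustrate the "different set of restriction enzyme sites" carried by the linker of SEQ ID NO: 39, the sketch below locates a few recognition sequences in the top strand. Only HpaI and ScaI are named in this specification; the NdeI and EcoRV entries, and all of the recognition sequences themselves, are standard enzyme data supplied here as assumptions rather than taken from the text. Positions are 0-based offsets from the first base as printed.

```python
LINKER_TOP = "CATGGATCATATGTTAACAAGTACTCGATATCAATGAGTGACTGAAGCT"  # SEQ ID NO: 39

# Standard recognition sequences (assumed from general enzyme data).
RECOGNITION_SITES = {
    "NdeI": "CATATG",
    "HpaI": "GTTAAC",
    "ScaI": "AGTACT",
    "EcoRV": "GATATC",
}


def find_sites(seq: str, sites: dict) -> dict:
    """Map each enzyme to the 0-based offset of its first site, skipping absent sites."""
    found = {}
    for enzyme, motif in sites.items():
        offset = seq.find(motif)
        if offset != -1:
            found[enzyme] = offset
    return found


site_offsets = find_sites(LINKER_TOP, RECOGNITION_SITES)
for enzyme, offset in sorted(site_offsets.items(), key=lambda item: item[1]):
    print(f"{enzyme:6s} {RECOGNITION_SITES[enzyme]} at offset {offset}")

# As with the pMG1 linker, stop codons follow the cloning sites in all three frames.
frames_with_stop = {
    i % 3 for i in range(len(LINKER_TOP) - 2)
    if LINKER_TOP[i:i + 3] in {"TAA", "TAG", "TGA"}
}
print("frames with a stop codon:", sorted(frames_with_stop))
```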

As described in Shatzman and Rosenberg, Meth.
Enzymol., 152:661-673 (1987), pOTS207 is a pAS derived
cloning vector which carries the kanamycin resistance
gene from Tn903 [Berg et al, Microbiology, ed. D.
Schlessinger, pp. 13-15, American Society for
Microbiology (Washington, DC 1978); Nomura et al, The
Single-Stranded DNA Phages, ed. D. Denhardt et al,
pp. 467-472, Cold Spring Harbor Laboratory (New York
1978); Castellazzi et al, Molec. Gen. Genet., 117:211-
218 (1982)]. It was constructed by digesting plasmid
pUC8 [Yanisch-Perron et al, Gene, 23:103-119 (1985)]
with BamHI and ligating it to a BclI fragment containing the
kanamycin gene from Tn903. The resulting plasmid, pUC8-Kan,
was digested with EcoRI and PstI, and the fragment
containing the kanamycin gene was inserted between the
EcoRI and PstI sites of pOTSV [Shatzman and Rosenberg,
cited above]. The resulting plasmid is pOTS207.

The pOTS207 was digested with EcoRI and PstI,
and the 1467 bp fragment containing the kanamycin
resistance gene was isolated. Synthetic
oligonucleotides:

SEQ ID NO: 41: 5' AATTCGTACCTA 3'

SEQ ID NO: 42: 3'     GCATGGATCTAG 5'

were made to link the NPT-1 gene to the pMG42B vector. pMG42B
was digested with BglII and PstI. The EcoRI/PstI NPT-1
gene fragment and the synthetic oligo linker were ligated
to the digested pMG42B. The resulting plasmid, pMG42Kn,
allows fusions, in three different reading frames, to the
NS1(42) gene, while allowing antibiotic selection with
kanamycin.

Plasmid pBHA is a pBR322-derived vector,
containing the complete nucleotide sequence of the
hemagglutinin (HA) gene of a type B influenza virus
(B/Lee/40). It is described by Krystal et al, Proc.
Natl. Acad. Sci. USA, 79:4800-4804 (1982). pBHA was
digested with RsaI and an 813 bp fragment containing the
HA subunit was isolated. This fragment was ligated into
plasmid pMG42Kn (described above) that had been digested
with ScaI. During the cloning, a base (T) was deleted
from the ScaI recognition site, shifting the gene out of
the reading frame. The vector was digested with NcoI
and filled in using Klenow, putting the gene back into
the reading frame.

The resulting construct, pMG42BLHA2 [SEQ ID NO:
14], expresses a fusion polypeptide containing amino
acids 1-42 of NS1 and 41-233 of the HA2 subunit. This
construct contains the Cys to Ser change at amino acid
#13 of the NS1 portion of the fusion peptide.

In preliminary studies with this construct, 
vaccinated laboratory mice demonstrated protection from 
challenge with type B influenza in the absence of 
neutralizing antibody for the virus. 

EXAMPLE 6 - PREPARING SEED VIRUS AND RAISING ANTISERA

The seed virus, A/Udorn, was prepared according 
to the procedures described in P. Palese and J. Schulman, 
Virol., 57:227-237 (1974). Briefly, this technique is as
follows.

Influenza virus strain A/Udorn was inoculated
in 10-day old embryonated hen's eggs into the allantoic 
cavity. The eggs were incubated for 24-48 hours at 35 °C 
then chilled at 4 °C overnight. A portion of the eggshell 
over the airsac was removed and the allantoic fluid was 
aseptically removed using a 10-ml syringe. The fluid was 
centrifuged at low speed (3,000 x g) to remove 
particulates. This clarified supernatant was centrifuged 
at high speed using an SW28 Beckman rotor at 27,000 rpm
(4°C for 90 minutes), resulting in the virus pellet. The
virus was resuspended in 10 mM Tris (pH 7.5) containing
100 mM NaCl, 1 mM EDTA and repelleted as before. The
virus was layered on a 30-60% sucrose gradient in 1 mM EDTA
(NTE) and spun for 3-5 hours at 25,000 rpm. The band in
the middle of the tube was withdrawn, diluted in NTE and 
centrifuged at 27,000 rpm for 90 minutes. The pellet was 
suspended in phosphate-buffered saline (PBS) . These 
viral particles were used as immunogens for preparation 
of antisera. 
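
The procedure above gives the low-speed spin as a relative centrifugal force (3,000 x g) but the ultracentrifuge spins as rotor speeds (rpm). A rough conversion can be sketched with the standard relation RCF = 1.118e-5 x r(cm) x rpm^2. The rotor radius used below is an assumption (a nominal maximum radius of about 16.1 cm for an SW28-type swinging-bucket rotor) and is not stated in the text; the actual force depends on the radial position of the sample in the tube.

```python
def rcf(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (multiples of g) at a given speed and radius."""
    return 1.118e-5 * radius_cm * rpm ** 2


# ASSUMPTION: nominal maximum radius for an SW28-type rotor; not given in the text.
ASSUMED_RMAX_CM = 16.1

for label, rpm in (("pelleting spin", 27_000), ("sucrose gradient spin", 25_000)):
    print(f"{label}: {rpm} rpm is roughly {rcf(rpm, ASSUMED_RMAX_CM):,.0f} x g at r_max")
```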

Antisera were prepared as follows. 100-200
micrograms of purified virus in complete Freund's 
adjuvant was injected into the subscapula of a New 
Zealand White rabbit. A second injection in incomplete 
Freund's adjuvant was done 4 weeks later, and the animals 
were bled 7-10 days later. 

EXAMPLE 7 - EXPRESSION OF H3HA2 FUSION PROTEINS

A. NS1(1-81)H3HA2(1-221) [SEQ ID NO: 9 & 10]

The plasmid pMG1H3HA2(1-221) [SEQ ID NO: 9] was
transfected into E. coli strain AR58 [SmithKline Beecham
Pharmaceuticals]. Cultures were grown at 32°C to mid-log
phase, at which time cultures were shifted to 39.5°C for 2
hours. The E. coli cell pellets containing the
recombinant polypeptide were then stored at -70°C until
used.

Production of the NS1(1-81)H3HA2(1-221) protein [SEQ ID
NO: 10] was confirmed by Western blot analysis [Towbin et
al, Proc. Natl. Acad. Sci. U.S.A., 76:4350 (1979)] using
antisera prepared against A/Udorn virus, as described in
Example 6. A major immunoreactive species was found at a
molecular weight of 35,050 daltons.

B. NS1(1-81)H3HA2(77-221) [SEQ ID NO: 11 & 12]

The plasmid encoding the NS1(1-81)H3HA2(77-221) peptide
[SEQ ID NO: 11 & 12] was expressed as described in part A
above. Production of this peptide was confirmed by
Western blot analysis, as described above. A major
immunoreactive species was found at a molecular weight of
26,697 daltons.

EXAMPLE 8 - PARTIAL PURIFICATION OF H3HA2 FUSION PROTEINS

E. coli cell pellets containing the recombinant
polypeptides, prepared as described in Example 7, were
stored at -70°C until used. E. coli cells were thawed
and resuspended in lysis buffer A (50 mM Tris-HCl, 5%
glycerol, 2 mM EDTA and 0.1 mM DTT, pH 8.0) at 10
mL/gram. The stirred suspension was then treated with
lysozyme (0.2 mg/mL) for 45 minutes at room temperature
and sonicated 2x for 2-3 minutes each time by a
Sonicator. The resultant suspension was treated with
0.1% DOC for 60 minutes at 4°C, then centrifuged at
25,000 x g. The pellet was resuspended by sonication in
50 mM glycine pH 10.0, 5% glycerol, 2 mM EDTA and then
the suspension was treated with 1% Triton X-100 [J.T.
Baker Chemicals Co.] at 4°C for 60 minutes and
centrifuged as above.

The resulting pellet was solubilized in 50 mM
Tris, 8 M urea, pH 8.0 and centrifuged to remove any
insoluble material. This solubilized material is
dialyzed against 10 mM Tris, 1 mM EDTA, pH 8.0 followed,
again, by centrifugation of insoluble material. The
solubilized material is designated as "crude" material
and is used in in vitro and in vivo mouse assays. At
this point, the material is approximately 40-50% pure.

The "crude" material was electrophoresed
through an SDS-PAGE gel and the appropriate H3HA2 protein
bands were visualized by KCl staining according to D.
Hager et al, Anal. Biochem., 109:76-86 (1980). The band
was cut out and eluted electrophoretically by the "S&S
Elutrap Electro-Separation System" [Schleicher &
Schuell]. The electro-eluting buffer was Tris-glycine.
A concentrated and eluted sample was obtained
and exhaustively dialyzed against 0.01 N NH4HCO3 and 0.02%
SDS [M. Hunkapiller et al, Methods Enzymol., 91:227-236
(1983)]. This sample was frozen quickly on dry ice and
lyophilized to complete dryness. The lyophilized
material was brought back into solution using 50 mM Tris
pH 8.0 and used for in vitro and in vivo mouse assays.

Following this gel elution step, the protein is 
usually greater than 75% pure. 

EXAMPLE 9 - H3 SUBTYPE HETEROLOGOUS PROTECTION ELICITED
BY VACCINATION WITH NS1(1-81)H3HA2(1-221) [SEQ ID NO: 10]

Mice (NIH/Swiss; 15 per group) were vaccinated
subcutaneously with 50 or 10 µg NS1(1-81)H3HA2(1-221) [SEQ ID NO: 9 & 10]
in aluminum hydroxide on days 0 and 21. The mice were
boosted intraperitoneally on day 42 with the protein
without adjuvant. On day 47, mice were challenged
intranasally with 2-3 LD50 doses of either A/PR/8/34
(H1N1) or A/HK/68 (H3N2) virus, and survival was
monitored through day 21. This represents a heterologous
challenge (A/PR/8/34) and an H3 heterosubtypic challenge,
since the NS1(1-81)H3HA2(1-221) construct [SEQ ID NO: 9 & 10] was
derived from A/Udorn/72 cDNA. The control group received
adjuvant (CFA) only.

The results in Table 1 below show that survival
in mice vaccinated with NS1(1-81)H3HA2(1-221) [SEQ ID NO: 10] and
challenged with A/HK/68 (80-93%) was significantly higher
than in control mice which were injected with adjuvant
only (26% survival). In contrast, vaccination with
NS1(1-81)H3HA2(1-221) [SEQ ID NO: 10] did not confer protection
against challenge with A/PR/8/34, an H1N1 strain (0-26%
survival). Thus protection elicited by NS1(1-81)H3HA2(1-221)
[SEQ ID NO: 10] is selective for antigenically diverse
virus strains within the H3 subtype.

Likewise, vaccination with the D protein
(NS1(1-81)HA2(65-222) [SEQ ID NO: 18], derived from the H1N1
subtype) elicits protection from heterosubtypic challenge
with H1N1, but not the H3N2 subtype [S. Dillon et al,
Nature, in press (1992); Mbawuike et al, FASEB J.,
5:A1362 (abs. 5749) and Table 1]. These results in
outbred mice also suggest that the response to the H1 and
H3 proteins will not be restricted to a limited number of
individuals with certain major histocompatibility
alleles, and therefore the vaccine will be effective in a
majority of individuals.


Table 1

Percent Survival After Challenge:

Immunization                     HA Subtype   A/PR/8/34 (H1N1)   A/HK/68 (H3N2)
50 µg NS1(1-81)H3HA2(1-221)      H3           26                 80*
10 µg NS1(1-81)H3HA2(1-221)      H3           0                  93*
10 µg NS1(1-81)HA2(65-222)       H1           67*                13
A/HK/68 virus                    H3           60*                100*
Control (Al+3)                   -            0                  26

* p < 0.05 vs. control in Fisher's exact probability test
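
The significance marks in Table 1 refer to Fisher's exact probability test. A minimal sketch of a two-sided version of that test follows. The survivor counts used in the example (12 of 15 vaccinated mice versus 4 of 15 controls) are back-calculated by us from the reported 80% and 26% survival with 15 mice per group; the specification does not give raw counts, nor does it say whether a one-sided or two-sided test was used.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same margins
    that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_observed = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so ties are not lost to floating-point rounding.
    return sum(p for p in map(prob, range(k_min, k_max + 1))
               if p <= p_observed * (1 + 1e-9))


# ASSUMED counts: 80% of 15 = 12 survivors (vaccinated), 26% of 15 = 4 survivors (control).
p_value = fisher_exact_two_sided(12, 3, 4, 11)
print(f"two-sided p = {p_value:.4f}; significant at 0.05: {p_value < 0.05}")
```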



Vaccination of mice with live homologous
(A/HK/68) virus provided complete or partial protection,
reflecting protection mediated by neutralizing antibody
(homologous H3N2 challenge) and/or CTL (heterologous H1N1
challenge), respectively.

Duration of protective immunity was tested by
immunizing mice subcutaneously with the recombinant
influenza protein plus adjuvant on days 0 and 21. Some
mice were also given an ip injection of the protein
(without adjuvant) on day 42. Mice were challenged with
A/HK/68 (H3N2) on day 47, four weeks after the second
injection. Control mice were immunized as described
above for Table 1, where an ip injection was given at
week 6 (5 days prior to challenge). The results in Table
2 show that CB6F1 mice (15 per group) were significantly
protected when challenged with the A/HK/68 heterologous
H3 virus strain 5-28 days after the last injection.

Table 2

Dose (µg per injection)
of NS1(1-81)H3HA2(1-221)    Adjuvant   Injection Schedule   Percent Survival
50 µg                       CFA        0,21                 86*
50 µg                       CFA        0,21,42              100*
0 µg                        CFA        0,21
50 µg                       Al+3       0,21                 93*
50 µg                       Al+3       0,21,42              93*
0 µg                        Al+3       0,21                 0

* p < 0.05 vs. control in Fisher's exact probability test



EXAMPLE 10 - TYPE A CROSS-PROTECTION WITH D AND H3C13 
PROTEIN 

Mice (CB6F1) were divided randomly into six
groups, with fifteen in each group. The mice were
injected subcutaneously with proteins in Al+3 (100 µg) on
days 0 and 21, and then were challenged with 2-3 LD50
doses of virus on day 49. Survival was monitored through
day 21. The results of this study are illustrated in
Table 3 below. For convenience, NS1(1-81)H3HA2(1-221) is referred
to as H3C13 in the table below.

Table 3

Percent Survival After Challenge with:

    Immunization               HA Subtype   A/PR/8/34   A/HK/68
1.  50 µg H3C13 + 50 µg D      H3, H1       73*         73*
2.  10 µg H3C13 + 10 µg D      H3, H1       67*         100*
3.  1 µg H3C13 + 1 µg D        H3, H1       86*         73*
4.  50 µg H3C13                H3           7           73*
5.  50 µg D                    H1           47**        7
6.  Al+3 control                            7           0

* p < 0.001 vs. control group
** p < 0.03 vs. control group

These data demonstrate that mice immunized with
a mixture of the D protein and H3C13 protein in aluminum
adjuvant were protected against challenge with either
A/PR/8/34 (H1) or A/HK/68 (H3) virus. In contrast, mice
immunized with the D protein were protected against H1
but not H3 challenge. Likewise, mice immunized with the
H3C13 protein were protected against the H3 but not the
H1 challenge. Therefore, the combination of the D
protein and the H3C13 proteins elicited protection
against the currently circulating subtypes of influenza A
virus. Thus, this combination represents a subtype
cross-protective vaccine.

Numerous modifications and variations of the
present invention are included in the above-identified
specification and are expected to be obvious to one of
skill in the art. Such modifications and alterations to
the compositions and processes of the present invention 
are believed to be encompassed in the scope of the claims 
appended hereto. 

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: Shatzman, Allan 
Scott, Miller 
Dillon, Susan B. 

(ii) TITLE OF INVENTION: Vaccinal Polypeptides 
(iii) NUMBER OF SEQUENCES: 42 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: SmithKline Beecham Corporation - Corpo: 

Patents 

(B) STREET: U.S. Mailcode VW2220 - 709 Swedeland Road 

(C) CITY: King of Prussia 

(D) STATE: Pennsylvania 

(E) COUNTRY: USA 

(F) ZIP: 19406-2799 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Canter, Carol G. 

(B) REGISTRATION NUMBER: 31,151 

(C) REFERENCE/DOCKET NUMBER: SBC14224-8 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 215-270-5013 

(B) TELEFAX: 215-270-5090 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 666 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..663 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GGC ATA TTC GGC GCA ATA GCA GGT TTC ATA GAA AAT GGT TGG GAG GGA 48
Gly Ile Phe Gly Ala Ile Ala Gly Phe Ile Glu Asn Gly Trp Glu Gly
1 5 10 15

ATG ATA GAC GGT TGG TAC GGT TTC AGG CAT CAA AAT TCT GAG GGC ACA 96
Met Ile Asp Gly Trp Tyr Gly Phe Arg His Gln Asn Ser Glu Gly Thr
20 25 30

GGA CAA GCA GCA GAT CTT AAA AGC ACT CAA GCA GCC ATC GAC CAA ATC 144
Gly Gln Ala Ala Asp Leu Lys Ser Thr Gln Ala Ala Ile Asp Gln Ile
35 40 45

AAT GGG AAA CTG AAT AGG GTA ATC GAG AAG ACG AAC GAG AAA TTC CAT 192
Asn Gly Lys Leu Asn Arg Val Ile Glu Lys Thr Asn Glu Lys Phe His
50 55 60

CAA ATC GAA AAG GAA TTC TCA GAA GTA GAA GGG AGA ATT CAG GAC CTC 240
Gln Ile Glu Lys Glu Phe Ser Glu Val Glu Gly Arg Ile Gln Asp Leu
65 70 75 80

GAG AAA TAC CTT GAA GAC ACT AAA ATA GAT CTC TGG TCT TAC AAT GCC 288
Glu Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu Trp Ser Tyr Asn Ala
85 90 95

GAG CTT CTT GTC GCT CTG GAG AAC CAA CAT ACA ATT GAT CTG ACT GAC 336
Glu Leu Leu Val Ala Leu Glu Asn Gln His Thr Ile Asp Leu Thr Asp
100 105 110

TCG GAA ATG AAC AAA CTG TTT GAA AAA ACA AGG AGG CAA CTG AGG GAA 384
Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg Arg Gln Leu Arg Glu
115 120 125

AAT GCT GAG GAC ATG GGC AAT GGT TGC TTC AAA ATA TAC CAC AAA TCT 432
Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys Ile Tyr His Lys Cys
130 135 140

GAC AAT GCT TGC ATA GGG TCA ATC AGA AAT GGG ACT TAT GAC CAT GAT 480
Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly Thr Tyr Asp His Asp
145 150 155 160

GTA TAC AGA GAC GAA GCA TTA AAC AAC CGC TTT CAG ATC AAA GGT GTT 528
Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe Gln Ile Lys Gly Val
165 170 175

GAA CTC AAG TCA GGA TAC AAA GAC TGC ATC CTG TGG ATT TCC TTT GCC 576
Glu Leu Lys Ser Gly Tyr Lys Asp Trp Ile Leu Trp Ile Ser Phe Ala
180 185 190

ATA TCA TGC TTT TTG CTT TCT GTT GTT TTG CTG GGG TTC ATC ATG TGG 624
Ile Ser Cys Phe Leu Leu Cys Val Val Leu Leu Gly Phe Ile Met Trp
195 200 205

GCC TGC CAG AAA GGC AAC ATT AGG TGC AAC ATT TGC ATT TGA 666
Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile Cys Ile
210 215 220
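
The pairing of codon rows with amino acid rows in the listing above can be spot-checked by translating the nucleotide line with the standard genetic code. The sketch below does this for the first 48 nucleotides of SEQ ID NO: 1 as printed; the function and table names are illustrative, and the check says nothing about lines where the scanned nucleotide text may itself be damaged.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces ignored) into three-letter amino acid codes."""
    dna = dna.replace(" ", "").upper()
    codons = (dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3))
    return " ".join(THREE_LETTER[CODON_TABLE[codon]] for codon in codons)


# First nucleotide row of SEQ ID NO: 1 as printed in the listing.
first_row = "GGC ATA TTC GGC GCA ATA GCA GGT TTC ATA GAA AAT GGT TGG GAG GGA"
print(translate(first_row))
```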

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 221 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Gly Ile Phe Gly Ala Ile Ala Gly Phe Ile Glu Asn Gly Trp Glu Gly
1 5 10 15

Met Ile Asp Gly Trp Tyr Gly Phe Arg His Gln Asn Ser Glu Gly Thr
20 25 30

Gly Gln Ala Ala Asp Leu Lys Ser Thr Gln Ala Ala Ile Asp Gln Ile
35 40 45

Asn Gly Lys Leu Asn Arg Val Ile Glu Lys Thr Asn Glu Lys Phe His
50 55 60

Gln Ile Glu Lys Glu Phe Ser Glu Val Glu Gly Arg Ile Gln Asp Leu
65 70 75 80

Glu Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu Trp Ser Tyr Asn Ala
85 90 95

Glu Leu Leu Val Ala Leu Glu Asn Gln His Thr Ile Asp Leu Thr Asp
100 105 110

Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg Arg Gln Leu Arg Glu
115 120 125

Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys Ile Tyr His Lys Cys
130 135 140

Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly Thr Tyr Asp His Asp
145 150 155 160

Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe Gln Ile Lys Gly Val
165 170 175

Glu Leu Lys Ser Gly Tyr Lys Asp Trp Ile Leu Trp Ile Ser Phe Ala
180 185 190

Ile Ser Cys Phe Leu Leu Cys Val Val Leu Leu Gly Phe Ile Met Trp
195 200 205

Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile Cys Ile
210 215 220

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 666 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..663

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GGC ATA TTC GGC GCA ATA CCA GGT TTC ATA GAA AAT GGT TGG GAG GGA 48
Gly Ile Phe Gly Ala Ile Ala Gly Phe Ile Glu Asn Gly Trp Glu Gly
1 5 10 15

ATG ATA GAC GGT TGG TAC GGT TTC AGG CAT CAA AAT TCC GAG GGC ACA 96
Met Ile Asp Gly Trp Tyr Gly Phe Arg His Gln Asn Ser Glu Gly Thr
20 25 30

GGA CAA GCA CCA GAT CTT AAA AGC ACT CAA GCA GCC ATC GAC CAA ATC 144
Gly Gln Ala Ala Asp Leu Lys Ser Thr Gln Ala Ala Ile Asp Gln Ile
35 40 45

AAT GGG AAA CTG AAT AGG GTA ATC GAG AAG ACG AAC GAG AAA TTC CAT 192
Asn Gly Lys Leu Asn Arg Val Ile Glu Lys Thr Asn Glu Lys Phe His
50 55 60

CAA ATC GAA AAG GAA TTC TCA GAA GTA GAA GGG ACA ATT CAG GAC CTC 240
Gln Ile Glu Lys Glu Phe Ser Glu Val Glu Gly Arg Ile Gln Asp Leu
65 70 75 80

GAG AAA TAC CTT GAA GAC ACT AAA ATA GAT CTC TGG TCT TAC AAT GCG 288
Glu Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu Trp Ser Tyr Asn Ala
85 90 95

GAC CTT CTT GTC GCT CTG GAG AAC CAA CAT ACA ATT CAT CTG ACT GAC 336
Glu Leu Leu Val Ala Leu Glu Asn Gln His Thr Ile Asp Leu Thr Asp
100 105 110

TCG GAA ATG AAC AAA CTG TTT GAA AAA ACA AGG AGG CAA CTG AGG GAA 384
Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg Arg Gln Leu Arg Glu
115 120 125

AAT GCT GAG GAC ATG GGC AAT GGT TGC TTC AAA ATA TAC CAC AAA TGT 432
Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys Ile Tyr His Lys Cys
130 135 140

GAC AAT GCT TGC ATA GGG TCA ATC AGA AAT GGG ACT TAT GAC CAT CAT 480
Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly Thr Tyr Asp His Asp
145 150 155 160

GTA TAC AGA CAC GAA CCA TTA AAC AAC CGG TTT CAG ATC AAA GGT GTT 528
Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe Gln Ile Lys Gly Val
165 170 175

GAA CTG AAG TCA GGA TAC AAA GAC TGC ATC CTC TGG ATT TCC TTT CCC 576
Glu Leu Lys Ser Gly Tyr Lys Asp Trp Ile Leu Trp Ile Ser Phe Ala
180 185 190

ATA TCA TCC TTT TTC CTT TGT GTT GTT TTC CTC CCC TTC ATC ATC TGG 624
Ile Ser Cys Phe Leu Leu Cys Val Val Leu Leu Gly Phe Ile Met Trp
195 200 205

CCC TGC CAA AAA GGC AAC ATT AGC TGC AAC ATT TCC ATT TGA 666
Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile Cys Ile
210 215 220



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 221 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Gly Ile Phe Gly Ala Ile Ala Gly Phe Ile Glu Asn Gly Trp Glu Gly
1 5 10 15

Met Ile Asp Gly Trp Tyr Gly Phe Arg His Gln Asn Ser Glu Gly Thr
20 25 30

Gly Gln Ala Ala Asp Leu Lys Ser Thr Gln Ala Ala Ile Asp Gln Ile
35 40 45

Asn Gly Lys Leu Asn Arg Val Ile Glu Lys Thr Asn Glu Lys Phe His
50 55 60

Gln Ile Glu Lys Glu Phe Ser Glu Val Glu Gly Arg Ile Gln Asp Leu
65 70 75 80

Glu Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu Trp Ser Tyr Asn Ala
85 90 95

Glu Leu Leu Val Ala Leu Glu Asn Gln His Thr Ile Asp Leu Thr Asp
100 105 110

Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg Arg Gln Leu Arg Glu
115 120 125

Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys Ile Tyr His Lys Cys
130 135 140

Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly Thr Tyr Asp His Asp
145 150 155 160

Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe Gln Ile Lys Gly Val
165 170 175

Glu Leu Lys Ser Gly Tyr Lys Asp Trp Ile Leu Trp Ile Ser Phe Ala
180 185 190

Ile Ser Cys Phe Leu Leu Cys Val Val Leu Leu Gly Phe Ile Met Trp
195 200 205

Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile Cys Ile
210 215 220
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(2) INFORMATION FOR SEQ ID NO: 5; 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 670 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDED NESS : double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME /KEY: CDS 

(B) LOCATION: 1..666 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO:5: 

GGT CTA TTT GGA GCC ATT GCC GGT TTT ATT GAA GGG GGA TGG ACT GGA 48
Gly Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu Gly Gly Trp Thr Gly
1 5 10 15

ATG ATA GAT GGA TGG TAC GGT TAT CAT CAT CAG AAT GAA CAG GGA TCA 96
Met Ile Asp Gly Trp Tyr Gly Tyr His His Gln Asn Glu Gln Gly Ser
20 25 30

GGC TAT GCA GCG GAT CAA AAA AGC ACA CAA AAT GCC ATT AAC GGG ATT 144
Gly Tyr Ala Ala Asp Gln Lys Ser Thr Gln Asn Ala Ile Asn Gly Ile
35 40 45

ACA AAC AAG GTG AAC TCT GTT ATC GAG AAA ATG AAC ATT CAA TTC ACA 192
Thr Asn Lys Val Asn Ser Val Ile Glu Lys Met Asn Ile Gln Phe Thr
50 55 60

GCT GTG GGT AAA GAA TTC AAC AAA TTA GAA AAA AGG ATG GAA AAT TTA 240
Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg Met Glu Asn Leu
65 70 75 80

AAT AAA AAA GTT GAT GAT GGA TTT CTG GAC ATT TGG ACA TAT AAT GCA 288
Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr Tyr Asn Ala
85 90 95

GAA TTG TTA GTT CTA CTG GAA AAT GAA AGG ACT CTG GAT TTC CAT GAC 336
Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp Phe His Asp
100 105 110

TCA AAT GTG AAG AAT CTG TAT GAG AAA GTA AAA AGC CAA TTA AAG AAT 384
Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln Leu Lys Asn
115 120 125

AAT GCC AAA GAA ATC GGA AAT GGA TGT TTT GAG TTC TAC CAC AAG TGT 432
Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr His Lys Cys
130 135 140

GAC AAT GAA TGC ATG GAA AGT GTA AGA AAT GGG ACT TAT GAT TAT CCC 480
Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro
145 150 155 160

AAA TAT TCA GAA GAG TCA AAG TTG AAC AGG GAA AAG GTA GAT GGA GTG 528
Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val Asp Gly Val
165 170 175

AAA TTG GAA TCA ATG GGG ATC TAT CAG ATT CTG GCG ATC TAC TCA ACT 576
Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile Tyr Ser Thr
180 185 190

GTC GCC AGT TCA CTG GTG CTT TTG GTC TCC CTG GGG GCA ATC AGT TTC 624
Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala Ile Ser Phe
195 200 205

TGG ATG TGT TCT AAT GGA TCT TTG CAG TGC AGA ATA TGC ATC 666
Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys Ile
210 215 220

TGAG 670
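
As a consistency check on a record of this kind, the length of the coding region can be compared with the length of the encoded protein. The following Python sketch is illustrative only and is not part of the original listing. The function name `residues_encoded` is a hypothetical helper, and the sketch assumes that the CDS location is 1-based, inclusive, and excludes the stop codon, as the 1..666 location given for SEQ ID NO: 5 suggests.

```python
def residues_encoded(cds_start: int, cds_end: int) -> int:
    """Return the number of codons in an inclusive, 1-based CDS range.

    Assumption: the range covers whole codons only and excludes the stop codon.
    """
    length = cds_end - cds_start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("CDS length must be a positive multiple of three")
    return length // 3


# SEQ ID NO: 5 lists a CDS at 1..666; SEQ ID NO: 6 lists 222 amino acids.
print(residues_encoded(1, 666))  # prints 222
```

Under the same assumption, a location of 1..921 corresponds to 307 residues, which agrees with the length stated for SEQ ID NO: 16.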



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 222 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Gly Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu Gly Gly Trp Thr Gly
1 5 10 15

Met Ile Asp Gly Trp Tyr Gly Tyr His His Gln Asn Glu Gln Gly Ser
20 25 30

Gly Tyr Ala Ala Asp Gln Lys Ser Thr Gln Asn Ala Ile Asn Gly Ile
35 40 45

Thr Asn Lys Val Asn Ser Val Ile Glu Lys Met Asn Ile Gln Phe Thr
50 55 60

Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg Met Glu Asn Leu
65 70 75 80

Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr Tyr Asn Ala
85 90 95

Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp Phe His Asp
100 105 110

Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln Leu Lys Asn
115 120 125

Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr His Lys Cys
130 135 140

Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro
145 150 155 160

Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val Asp Gly Val
165 170 175

Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile Tyr Ser Thr
180 185 190

Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala Ile Ser Phe
195 200 205

Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys Ile
210 215 220



(2) INFORMATION FOR SEQ ID NO:7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 670 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..670

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CGCATATTCC CCGCAATAGC AGGTTTCATA GAAAATGGTT GGGAGGGAAT GATAGACGGT 60

TGGTACGGTT TCAGGCATCA AAATTCNGAG GGCACAGGAC AAGCAGCAGA TCTTAAAAGC 120

ACTCAAGCAG CCATCGACCA AATCAATGGG AAACTGAATA GGGTAATCGA GAAGACGAAC 180

GAGAAATTCC ATCAAATCGA AAAGGAATTC TCACAAGTAG AAGGGACAAT TCAGGACCTC 240

GAGAAATACG TTGAAGACAC TAAAATAGAT CTCTGGTCTT ACAATGCGGA GCTTCTTGTC 300

CCTCTGGAGA ACCAACATAC AATTGATCTG ACTGACTCGG AAATGAACAA ACTCTTTGAA 360

AAAACAAGGA GGCAACTGAG GGAAAATGCT GAGGACATGG GCAATGGTTG CTTCAAAATA 420

TACCACAAAT GTGACAATGC TTGCATAGGG TCAATCAGAA ATGGGACTTA TGACCATGAT 480

GTATACAGAG ACGAAGCATT AAACAACCGG TTTCAGATCA AAGGTGTTGA ACTGAAGTCA 540

GGATACAAAG ACTGGATCCT GTGGATTTCC TTTCCCATAT CATGCTTTTT GCTTTCTGTT 600

GTTTTGCTGG GGTTCATCAN NNTGTGGGCC TGCCANAAAG GCAACATTAG GTGCAACATT 660

TGCATTTCAN 670



(2) INFORMATION FOR SEQ ID NO:8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 222 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Gly Ile Phe Gly Ala Ile Ala Gly Phe Ile Glu Asn Gly Trp Glu Gly
1 5 10 15

Met Ile Asp Gly Trp Tyr Gly Phe Arg His Gln Asn Ser Glu Gly Thr
20 25 30

Gly Gln Ala Ala Asp Leu Lys Ser Thr Gln Ala Ala Ile Asp Gln Ile
35 40 45

Asn Gly Lys Leu Asn Arg Val Ile Glu Lys Thr Asn Glu Lys Phe His
50 55 60

Gln Ile Glu Lys Glu Phe Ser Glu Val Glu Gly Arg Ile Gln Asp Leu
65 70 75 80

Glu Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu Trp Ser Tyr Asn Ala
85 90 95

Glu Leu Leu Val Ala Leu Glu Asn Gln His Thr Ile Asp Leu Thr Asp
100 105 110

Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg Arg Gln Leu Arg Glu
115 120 125

Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys Ile Tyr His Lys Cys
130 135 140

Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly Thr Tyr Asp His Asp
145 150 155 160

Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe Gln Ile Lys Gly Val
165 170 175

Glu Leu Lys Ser Xaa Gly Tyr Lys Asp Trp Ile Leu Trp Ile Ser Phe
180 185 190

Ala Ile Ser Cys Phe Leu Leu Cys Val Val Leu Leu Gly Phe Ile Met
195 200 205

Trp Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile Cys Ile
210 215 220



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 918 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 



(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..918

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA AGA GGA AGG GGC AGC 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

ACT CTT GGT CTG GAC ATC GAG ACA GCC ACA CGT GCT GGA AAG CAG ATA 192
Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

GTG GAG CGC ATT CTG AAA GAA GAA TCC GAT GAG GCA CTT AAA ATG ACC 240
Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

ATG GGC GCC CAT ATG GGC ATA TTC GGC GCA ATA GCA GGT TTC ATA GAA 288
Met Gly Ala His Met Gly Ile Phe Gly Ala Ile Ala Gly Phe Ile Glu
85 90 95

AAT GGT TGG GAG GGA ATG ATA GAC GGT TGG TAC GGT TTC AGG CAT CAA 336
Asn Gly Trp Glu Gly Met Ile Asp Gly Trp Tyr Gly Phe Arg His Gln
100 105 110

AAT TCT GAG GGC ACA GGA CAA GCA GCA GAT CTT AAA AGC ACT CAA GCA 384
Asn Ser Glu Gly Thr Gly Gln Ala Ala Asp Leu Lys Ser Thr Gln Ala
115 120 125

GCC ATC GAC CAA ATC AAT GGG AAA CTG AAT AGG GTA ATC GAG AAG ACG 432
Ala Ile Asp Gln Ile Asn Gly Lys Leu Asn Arg Val Ile Glu Lys Thr
130 135 140

AAC GAG AAA TTC CAT CAA ATC GAA AAG GAA TTC TCA GAA GTA GAA GGG 480
Asn Glu Lys Phe His Gln Ile Glu Lys Glu Phe Ser Glu Val Glu Gly
145 150 155 160

AGA ATT CAG GAC CTC GAG AAA TAC GTT GAA GAC ACT AAA ATA GAT CTC 528
Arg Ile Gln Asp Leu Glu Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu
165 170 175

TGG TCT TAC AAT GCG GAG CTT CTT GTC GCT CTC GAG AAC CAA CAT ACA 576
Trp Ser Tyr Asn Ala Glu Leu Leu Val Ala Leu Glu Asn Gln His Thr
180 185 190

ATT GAT CTG ACT GAC TCG GAA ATG AAC AAA CTG TTT GAA AAA ACA AGG 624
Ile Asp Leu Thr Asp Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg
195 200 205

AGG CAA CTG AGG GAA AAT GCT GAG GAC ATG GGC AAT GGT TGC TTC AAA 672
Arg Gln Leu Arg Glu Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys
210 215 220

ATA TAC CAC AAA TGT GAC AAT GCT TGC ATA GGG TCA ATC AGA AAT GGG 720 
Ile Tyr His Lys Cys Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly
225 230 235 240

ACT TAT GAC CAT GAT GTA TAC AGA GAC GAA GCA TTA AAC AAC CGG TTT 768
Thr Tyr Asp His Asp Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe
245 250 255

CAG ATC AAA GGT GTT GAA CTG AAG TCA GGA TAC AAA GAC TGG ATC CTC 816
Gln Ile Lys Gly Val Glu Leu Lys Ser Gly Tyr Lys Asp Trp Ile Leu
260 265 270

TGG ATT TCC TTT GCC ATA TCA TGC TTT TTG CTT TGT GTT GTT TTG CTC 864
Trp Ile Ser Phe Ala Ile Ser Cys Phe Leu Leu Cys Val Val Leu Leu
275 280 285

GGG TTC ATC ATG TGG GCC TGC CAA AAA GGC AAC ATT AGG TGC AAC ATT 912
Gly Phe Ile Met Trp Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile
290 295 300

TGC ATT 918
Cys Ile
305



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 306 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Gly Ala His Met Gly Ile Phe Gly Ala Ile Ala Gly Phe Ile Glu
85 90 95

Asn Gly Trp Glu Gly Met Ile Asp Gly Trp Tyr Gly Phe Arg His Gln
100 105 110

Asn Ser Glu Gly Thr Gly Gln Ala Ala Asp Leu Lys Ser Thr Gln Ala
115 120 125

Ala Ile Asp Gln Ile Asn Gly Lys Leu Asn Arg Val Ile Glu Lys Thr
130 135 140

Asn Glu Lys Phe His Gln Ile Glu Lys Glu Phe Ser Glu Val Glu Gly
145 150 155 160

Arg Ile Gln Asp Leu Glu Lys Tyr Val Glu Asp Thr Lys Ile Asp Leu
165 170 175

Trp Ser Tyr Asn Ala Glu Leu Leu Val Ala Leu Glu Asn Gln His Thr
180 185 190

Ile Asp Leu Thr Asp Ser Glu Met Asn Lys Leu Phe Glu Lys Thr Arg
195 200 205

Arg Gln Leu Arg Glu Asn Ala Glu Asp Met Gly Asn Gly Cys Phe Lys
210 215 220

Ile Tyr His Lys Cys Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly
225 230 235 240

Thr Tyr Asp His Asp Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe
245 250 255

Gln Ile Lys Gly Val Glu Leu Lys Ser Gly Tyr Lys Asp Trp Ile Leu
260 265 270

Trp Ile Ser Phe Ala Ile Ser Cys Phe Leu Leu Cys Val Val Leu Leu
275 280 285

Gly Phe Ile Met Trp Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile
290 295 300

Cys Ile
305
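
The paired nucleotide and amino acid lines in the preceding records can be cross-checked by translating the codons with the standard genetic code. The Python sketch below is illustrative only and is not part of the original listing. The names `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical, and the sketch assumes the standard genetic code and an input that starts in frame.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter amino acid codes ("Ter" marks a stop codon).
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into space-separated three-letter codes."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return " ".join(THREE_LETTER[CODON_TABLE[c]] for c in codons)


# First sixteen codons of SEQ ID NO: 9 as printed in the listing.
print(translate("ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG"))
```

For the sixteen codons shown, the result is the first line of SEQ ID NO: 10 (Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp). A codon containing an ambiguous base such as N is not in the table and raises a `KeyError`, so a sequence such as SEQ ID NO: 7 would need separate handling.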



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 690 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..690

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA AGA GGA AGG GGC AGC 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

ACT CTT GGT CTG GAC ATC GAG ACA GCC ACA CGT GCT GGA AAG CAG ATA 192
Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

GTG GAG CGC ATT CTG AAA GAA GAA TCC GAT GAG GCA CTT AAA ATG ACC 240
Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

ATG GAT CAT ATG TTA ATT CAG GAC CTC GAG AAA TAC GTT GAA GAC ACT 288
Met Asp His Met Leu Ile Gln Asp Leu Glu Lys Tyr Val Glu Asp Thr
85 90 95

AAA ATA GAT CTC TGG TCT TAC AAT GCG GAG CTT CTT GTC GCT CTG GAG 336
Lys Ile Asp Leu Trp Ser Tyr Asn Ala Glu Leu Leu Val Ala Leu Glu
100 105 110

AAC CAA CAT ACA ATT GAT CTG ACT GAC TCG GAA ATG AAC AAA CTG TTT 384
Asn Gln His Thr Ile Asp Leu Thr Asp Ser Glu Met Asn Lys Leu Phe
115 120 125

GAA AAA ACA AGG AGG CAA CTG AGG GAA AAT GCT GAG GAC ATG GGC AAT 432
Glu Lys Thr Arg Arg Gln Leu Arg Glu Asn Ala Glu Asp Met Gly Asn
130 135 140

GGT TGC TTC AAA ATA TAC CAC AAA TGT GAC AAT GCT TGC ATA GGG TCA 480
Gly Cys Phe Lys Ile Tyr His Lys Cys Asp Asn Ala Cys Ile Gly Ser
145 150 155 160

ATC AGA AAT GGG ACT TAT GAC CAT GAT GTA TAC AGA GAC GAA GCA TTA 528
Ile Arg Asn Gly Thr Tyr Asp His Asp Val Tyr Arg Asp Glu Ala Leu
165 170 175

AAC AAC CGG TTT CAG ATC AAA GGT GTT GAA CTG AAG TCA GGA TAC AAA 576
Asn Asn Arg Phe Gln Ile Lys Gly Val Glu Leu Lys Ser Gly Tyr Lys
180 185 190

GAC TGG ATC CTG TGG ATT TCC TTT GCC ATA TCA TGC TTT TTG CTT TGT 624
Asp Trp Ile Leu Trp Ile Ser Phe Ala Ile Ser Cys Phe Leu Leu Cys
195 200 205

GTT GTT TTG CTG GGG TTC ATC ATG TGG GCC TGC CAA AAA GGC AAC ATT 672
Val Val Leu Leu Gly Phe Ile Met Trp Ala Cys Gln Lys Gly Asn Ile
210 215 220

AGG TGC AAC ATT TGC ATT 690
Arg Cys Asn Ile Cys Ile
225 230



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 230 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Asp His Met Leu Ile Gln Asp Leu Glu Lys Tyr Val Glu Asp Thr
85 90 95

Lys Ile Asp Leu Trp Ser Tyr Asn Ala Glu Leu Leu Val Ala Leu Glu
100 105 110

Asn Gln His Thr Ile Asp Leu Thr Asp Ser Glu Met Asn Lys Leu Phe
115 120 125

Glu Lys Thr Arg Arg Gln Leu Arg Glu Asn Ala Glu Asp Met Gly Asn
130 135 140

Gly Cys Phe Lys Ile Tyr His Lys Cys Asp Asn Ala Cys Ile Gly Ser
145 150 155 160

Ile Arg Asn Gly Thr Tyr Asp His Asp Val Tyr Arg Asp Glu Ala Leu
165 170 175

Asn Asn Arg Phe Gln Ile Lys Gly Val Glu Leu Lys Ser Gly Tyr Lys
180 185 190

Asp Trp Ile Leu Trp Ile Ser Phe Ala Ile Ser Cys Phe Leu Leu Cys
195 200 205

Val Val Leu Leu Gly Phe Ile Met Trp Ala Cys Gln Lys Gly Asn Ile
210 215 220

Arg Cys Asn Ile Cys Ile
225 230



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 699 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..699

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TCC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Ser Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC ATG CAT GGA TCA TAT GTT 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Met His Gly Ser Tyr Val
35 40 45

?* 6 if* 5** *** C f T ATA ** c ** c *™ ACA AAA AAT CTC AAC TAT 192 
Asn Lys Thr Gln Glu Ala Ile Asn Lys Ile Thr Lys Asn Leu Asn Tyr
50 55 60

TTA AGT GAG CTA GAA GTA AAA AAC CTT CAA AGA CTA AGC GGA GCA ATG 240
Leu Ser Glu Leu Glu Val Lys Asn Leu Gln Arg Leu Ser Gly Ala Met
65 70 75 80

AAT GAG CTT CAC GAC GAA ATA CTC GAG CTA GAC GAA AAA GTG GAT GAT 288
Asn Glu Leu His Asp Glu Ile Leu Glu Leu Asp Glu Lys Val Asp Asp
85 90 95

CTA AGA GCT GAT ACA ATA AGC TCA CAA ATA GAG CTT GCA GTC TTG CTT 336
Leu Arg Ala Asp Thr Ile Ser Ser Gln Ile Glu Leu Ala Val Leu Leu
100 105 110

TCC AAC GAA GGG ATA ATA AAC AGT GAA GAT GAG CAT CTC TTG GCA CTT 384
Ser Asn Glu Gly Ile Ile Asn Ser Glu Asp Glu His Leu Leu Ala Leu
115 120 125

GAA AGA AAA CTG AAG AAA ATG CTT GGC CCC TCT GCT GTA GAA ATA GGG 432
Glu Arg Lys Leu Lys Lys Met Leu Gly Pro Ser Ala Val Glu Ile Gly
130 135 140

AAT GGG TGC TTT GAA ACC AAA CAC AAA TGC AAC CAG ACT TGC CTA GAC 480
Asn Gly Cys Phe Glu Thr Lys His Lys Cys Asn Gln Thr Cys Leu Asp
145 150 155 160

AGG ATA GCT GCT GGC ACC TTT AAT GCA GGA GAT TTT TCT CTT CCC ACT 528
Arg Ile Ala Ala Gly Thr Phe Asn Ala Gly Asp Phe Ser Leu Pro Thr
165 170 175

TTT GAT TCA TTA AAC ATT ACT GCT GCA TCT TTA AAT GAT GAT GGC TTG 576
Phe Asp Ser Leu Asn Ile Thr Ala Ala Ser Leu Asn Asp Asp Gly Leu
180 185 190

GAT AAT CAT ACT ATA CTG CTC TAC TAC TCA ACT GCT GCT TCT AGC TTG 624
Asp Asn His Thr Ile Leu Leu Tyr Tyr Ser Thr Ala Ala Ser Ser Leu
195 200 205

GCT GTA ACA TTA ATG ATA GCT ATC TTC ATT GTC TAC ATG GTC TCC AGA 672
Ala Val Thr Leu Met Ile Ala Ile Phe Ile Val Tyr Met Val Ser Arg
210 215 220

GAC AAT GTT TCT TGT TCC ATC TGT CTG 699
Asp Asn Val Ser Cys Ser Ile Cys Leu
225 230



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 233 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Ser Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Met His Gly Ser Tyr Val
35 40 45

Asn Lys Thr Gln Glu Ala Ile Asn Lys Ile Thr Lys Asn Leu Asn Tyr
50 55 60

Leu Ser Glu Leu Glu Val Lys Asn Leu Gln Arg Leu Ser Gly Ala Met
65 70 75 80

Asn Glu Leu His Asp Glu Ile Leu Glu Leu Asp Glu Lys Val Asp Asp
85 90 95

Leu Arg Ala Asp Thr Ile Ser Ser Gln Ile Glu Leu Ala Val Leu Leu
100 105 110

Ser Asn Glu Gly Ile Ile Asn Ser Glu Asp Glu His Leu Leu Ala Leu
115 120 125

Glu Arg Lys Leu Lys Lys Met Leu Gly Pro Ser Ala Val Glu Ile Gly
130 135 140

Asn Gly Cys Phe Glu Thr Lys His Lys Cys Asn Gln Thr Cys Leu Asp
145 150 155 160

Arg Ile Ala Ala Gly Thr Phe Asn Ala Gly Asp Phe Ser Leu Pro Thr
165 170 175

Phe Asp Ser Leu Asn Ile Thr Ala Ala Ser Leu Asn Asp Asp Gly Leu
180 185 190

Asp Asn His Thr Ile Leu Leu Tyr Tyr Ser Thr Ala Ala Ser Ser Leu
195 200 205

Ala Val Thr Leu Met Ile Ala Ile Phe Ile Val Tyr Met Val Ser Arg
210 215 220

Asp Asn Val Ser Cys Ser Ile Cys Leu
225 230



(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 924 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..921

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA AGA GGA AGG GGC AGC 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

ACT CTT GGT CTG GAC ATC GAG ACA GCC ACA CGT GCT GGA AAG CAG ATA 192
Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

GTG GAG CGG ATT CTG AAA GAA GAA TCC GAT GAG GCA CTT AAA ATG ACC 240
Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

ATG GAT CTG TCC AGA GGT CTA TTT GGA GCC ATT GCC GGT TTT ATT GAA 288
Met Asp Leu Ser Arg Gly Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu
85 90 95

GGG GGA TGG ACT GGA ATG ATA GAT GGA TGG TAC GGT TAT CAT CAT CAG 336
Gly Gly Trp Thr Gly Met Ile Asp Gly Trp Tyr Gly Tyr His His Gln
100 105 110

AAT GAA CAG GGA TCA GGC TAT GCA GCG GAT CAA AAA AGC ACA CAA AAT 384
Asn Glu Gln Gly Ser Gly Tyr Ala Ala Asp Gln Lys Ser Thr Gln Asn
115 120 125

GCC ATT AAC GGG ATT ACA AAC AAG GTG AAC TCT GTT ATC GAG AAA ATG 432
Ala Ile Asn Gly Ile Thr Asn Lys Val Asn Ser Val Ile Glu Lys Met
130 135 140

AAC ATT CAA TTC ACA GCT GTG GGT AAA GAA TTC AAC AAA TTA GAA AAA 480
Asn Ile Gln Phe Thr Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys
145 150 155 160

AGG ATG GAA AAT TTA AAT AAA AAA GTT GAT GAT GGA TTT CTG GAC ATT 528
Arg Met Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile
165 170 175

TGG ACA TAT AAT GCA GAA TTG TTA GTT CTA CTG GAA AAT GAA AGG ACT 576
Trp Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr
180 185 190

CTG GAT TTC CAT GAC TCA AAT GTG AAG AAT CTG TAT GAG AAA GTA AAA 624
Leu Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys
195 200 205

AGC CAA TTA AAG AAT AAT GCC AAA GAA ATC GGA AAT GGA TGT TTT GAG 672
Ser Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu
210 215 220

TTC TAC CAC AAG TGT GAC AAT GAA TGC ATG GAA AGT GTA AGA AAT GGG 720
Phe Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly
225 230 235 240

ACT TAT GAT TAT CCC AAA TAT TCA GAA GAG TCA AAG TTG AAC AGG GAA 768
Thr Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu
245 250 255

AAG GTA GAT GGA GTC AAA TTG GAA TCA ATG GGG ATC TAT CAG ATT CTG 816
Lys Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu
260 265 270

GCG ATC TAC TCA ACT GTC GCC AGT TCA CTG GTG CTT TTG GTC TCC CTG 864
Ala Ile Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu
275 280 285

GGG GCA ATC AGT TTC TGG ATG TGT TCT AAT GGA TCT TTG CAG TGC AGA 912
Gly Ala Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg
290 295 300

ATA TGC ATC TGA 924
Ile Cys Ile
305

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 307 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Asp Leu Ser Arg Gly Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu
85 90 95

Gly Gly Trp Thr Gly Met Ile Asp Gly Trp Tyr Gly Tyr His His Gln
100 105 110

Asn Glu Gln Gly Ser Gly Tyr Ala Ala Asp Gln Lys Ser Thr Gln Asn
115 120 125

Ala Ile Asn Gly Ile Thr Asn Lys Val Asn Ser Val Ile Glu Lys Met
130 135 140

Asn Ile Gln Phe Thr Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys
145 150 155 160

Arg Met Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile
165 170 175

Trp Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr
180 185 190

Leu Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys
195 200 205

Ser Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu
210 215 220

Phe Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly
225 230 235 240

Thr Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu
245 250 255

Lys Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu
260 265 270

Ala Ile Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu
275 280 285

Gly Ala Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg
290 295 300

Ile Cys Ile
305



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 729 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..726

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA AGA GGA AGG GGC AGC 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

ACT CTT GGT CTG GAC ATC GAG ACA GCC ACA CGT GCT GGA AAG CAG ATA 192
Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

GTG GAG CGG ATT CTG AAA GAA GAA TCC GAT GAG GCA CTT AAA ATG ACC 240
Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

ATG CAG ATC CCG GCT GTG GGT AAA GAA TTC AAC AAA TTA GAA AAA AGG 288
Met Gln Ile Pro Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg
85 90 95

ATG GAA AAT TTA AAT AAA AAA GTT GAT GAT GGA TTT CTG GAC ATT TGG 336
Met Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp
100 105 110

ACA TAT AAT GCA GAA TTG TTA GTT CTA CTG GAA AAT GAA AGG ACT CTG 384
Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu
115 120 125

GAT TTC CAT GAC TCA AAT GTG AAG AAT CTG TAT GAG AAA GTA AAA AGC 432
Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser
130 135 140

CAA TTA AAG AAT AAT GCC AAA GAA ATC GGA AAT GGA TGT TTT GAG TTC 480
Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe
145 150 155 160

TAC CAC AAG TGT GAC AAT GAA TGC ATG GAA AGT GTA AGA AAT GGG ACT 528
Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr
165 170 175

TAT GAT TAT CCC AAA TAT TCA GAA GAG TCA AAG TTG AAC AGG GAA AAG 576
Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys
180 185 190

GTA GAT GGA GTG AAA TTG GAA TCA ATG GGG ATC TAT CAG ATT CTG GCG 624
Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala
195 200 205

ATC TAC TCA ACT GTC GCC AGT TCA CTG GTG CTT TTG GTC TCC CTG GGG 672
Ile Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly
210 215 220

GCA ATC AGT TTC TGG ATG TGT TCT AAT GGA TCT TTG CAG TGC AGA ATA 720
Ala Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile
225 230 235 240

TGC ATC TGA 729
Cys Ile



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 242 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Gln Ile Pro Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg
85 90 95

Met Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp
100 105 110

Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu
115 120 125

Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser
130 135 140

Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe
145 150 155 160

Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr
165 170 175

Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys
180 185 190

Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala
195 200 205

Ile Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly
210 215 220

Ala Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile
225 230 235 240

Cys Ile



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 810 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..807

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC ATG GAT CTG TCC AGA GGT 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Met Asp Leu Ser Arg Gly
35 40 45

CTA TTT GGA GCC ATT GCC GGT TTT ATT GAA GGG GGA TGG ACT GGA ATG 192
Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu Gly Gly Trp Thr Gly Met
50 55 60

ATA GAT GGA TGG TAC GGT TAT CAT CAT CAG AAT GAA CAG GGA TCA GGC 240
Ile Asp Gly Trp Tyr Gly Tyr His His Gln Asn Glu Gln Gly Ser Gly
65 70 75 80

TAT GCA GCG GAT CAA AAA AGC ACA CAA AAT GCC ATT AAC GGG ATT ACA 288
Tyr Ala Ala Asp Gln Lys Ser Thr Gln Asn Ala Ile Asn Gly Ile Thr
85 90 95

AAC AAG GTG AAC TCT GTT ATC GAG AAA ATG AAC ATT CAA TTC ACA GCT 336
Asn Lys Val Asn Ser Val Ile Glu Lys Met Asn Ile Gln Phe Thr Ala
100 105 110

GTG GGT AAA GAA TTC AAC AAA TTA GAA AAA AGG ATG GAA AAT TTA AAT 384
Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg Met Glu Asn Leu Asn
115 120 125

AAA AAA GTT GAT GAT GGA TTT CTG GAC ATT TGG ACA TAT AAT GCA GAA 432
Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr Tyr Asn Ala Glu
130 135 140

TTG TTA GTT CTA CTG GAA AAT GAA AGG ACT CTG GAT TTC CAT GAC TCA 480
Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp Phe His Asp Ser
145 150 155 160

AAT GTG AAG AAT CTG TAT GAG AAA GTA AAA AGC CAA TTA AAG AAT AAT 528
Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln Leu Lys Asn Asn
165 170 175

GCC AAA GAA ATC GGA AAT GGA TGT TTT GAG TTC TAC CAC AAG TGT GAC 576
Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr His Lys Cys Asp
180 185 190

AAT GAA TGC ATG GAA AGT GTA AGA AAT GGG ACT TAT GAT TAT CCC AAA 624
Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro Lys
195 200 205

TAT TCA GAA GAG TCA AAG TTG AAC AGG GAA AAG GTA GAT GGA GTG AAA 672
Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val Asp Gly Val Lys
210 215 220

TTG GAA TCA ATG GGG ATC TAT CAG ATT CTG GCG ATC TAC TCA ACT GTC 720
Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile Tyr Ser Thr Val
225 230 235 240

GCC AGT TCA CTG GTG CTT TTG GTC TCC CTG GGG GCA ATC AGT TTC TGG 768
Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala Ile Ser Phe Trp
245 250 255

ATG TGT TCT AAT GGA TCT TTG CAG TGC AGA ATA TGC ATC TGA 810
Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys Ile
260 265



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 269 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Met Asp Leu Ser Arg Gly
35 40 45

Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu Gly Gly Trp Thr Gly Met
50 55 60

Ile Asp Gly Trp Tyr Gly Tyr His His Gln Asn Glu Gln Gly Ser Gly
65 70 75 80

Tyr Ala Ala Asp Gln Lys Ser Thr Gln Asn Ala Ile Asn Gly Ile Thr
85 90 95

Asn Lys Val Asn Ser Val Ile Glu Lys Met Asn Ile Gln Phe Thr Ala
100 105 110

Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg Met Glu Asn Leu Asn
115 120 125

Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr Tyr Asn Ala Glu
130 135 140

Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp Phe His Asp Ser
145 150 155 160

Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln Leu Lys Asn Asn
165 170 175

Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr His Lys Cys Asp
180 185 190

Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro Lys
195 200 205

Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val Asp Gly Val Lys
210 215 220

Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile Tyr Ser Thr Val
225 230 235 240

Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala Ile Ser Phe Trp
245 250 255

Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys Ile
260 265



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 630 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..627

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGC CTT CGC CGA GAT CAG AAA TCC ATG GAT CAT ATG TTA ACA 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Met Asp His Met Leu Thr
35 40 45

AGT ACT CGA TCT GTG GGT AAA GAA TTC AAC AAA TTA GAA AAA AGG ATG 192
Ser Thr Arg Ser Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg Met
50 55 60

GAA AAT TTA AAT AAA AAA GTT GAT GAT GGA TTT CTG GAC ATT TGG ACA 240
Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr
65 70 75 80

TAT AAT GCA GAA TTG TTA GTT CTA CTG GAA AAT GAA AGG ACT CTG GAT 288
Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp
85 90 95

TTC CAT GAC TCA AAT GTG AAG AAT CTG TAT GAG AAA GTA AAA AGC CAA 336
Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln
100 105 110

TTA AAG AAT AAT GCC AAA GAA ATC GGA AAT GGA TGT TTT GAG TTC TAC 384
Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr
115 120 125

CAC AAG TGT GAC AAT GAA TGC ATG GAA AGT GTA AGA AAT GGG ACT TAT 432
His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr
130 135 140

GAT TAT CCC AAA TAT TCA GAA GAG TCA AAG TTG AAC AGG GAA AAG GTA 480
Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val
145 150 155 160

GAT GGA GTG AAA TTG GAA TCA ATG GGG ATC TAT CAG ATT CTG GCG ATC 528
Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile
165 170 175

TAC TCA ACT GTC GCC AGT TCA CTG GTG CTT TTG GTC TCC CTG GGG GCA 576
Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala
180 185 190

ATC AGT TTC TGG ATG TGT TCT AAT GGA TCT TTG CAG TGC AGA ATA TGC 624
Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys
195 200 205

ATC TGA 630
Ile



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 209 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Met Asp His Met Leu Thr
35 40 45

Ser Thr Arg Ser Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg Met
50 55 60

Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr
65 70 75 80

Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp
85 90 95

Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln
100 105 110

Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr
115 120 125

His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr
130 135 140

Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val
145 150 155 160

Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile
165 170 175

Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala
180 185 190

Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys
195 200 205

Ile



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 717 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..714 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA AGA GGA AGG GGC AGC 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

ACT CTT GGT CTG GAC ATC GAG ACA GCC ACA CGT GCT GGA AAG CAG ATA 192
Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

GTG GAG CGG ATT CTG AAA GAA GAA TCC GAT GAG GCA CTT AAA ATG ACC 240
Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

ATG CAG ATC CCG GAA TTC AAC AAA TTA GAA AAA AGG ATG GAA AAT TTA 288
Met Gln Ile Pro Glu Phe Asn Lys Leu Glu Lys Arg Met Glu Asn Leu
85 90 95

AAT AAA AAA GTT GAT GAT GGA TTT CTG GAC ATT TGG ACA TAT AAT GCA 336
Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr Tyr Asn Ala
100 105 110

GAA TTG TTA GTT CTA CTG GAA AAT GAA AGG ACT CTG GAT TTC CAT GAC 384
Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp Phe His Asp
115 120 125

TCA AAT GTG AAG AAT CTG TAT GAG AAA GTA AAA AGC CAA TTA AAG AAT 432
Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln Leu Lys Asn
130 135 140

AAT GCC AAA GAA ATC GGA AAT GGA TGT TTT GAG TTC TAC CAC AAG TGT 480
Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr His Lys Cys
145 150 155 160

GAC AAT GAA TGC ATG GAA AGT GTA AGA AAT GGG ACT TAT GAT TAT CCC 528
Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro
165 170 175

AAA TAT TCA GAA GAG TCA AAG TTG AAC AGG GAA AAG GTA GAT GGA GTG 576
Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val Asp Gly Val
180 185 190

AAA TTG GAA TCA ATG GGG ATC TAT CAG ATT CTG GCG ATC TAC TCA ACT 624
Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile Tyr Ser Thr
195 200 205

GTC GCC AGT TCA CTG GTG CTT TTG GTC TCC CTG GGG GCA ATC AGT TTC 672
Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala Ile Ser Phe
210 215 220

TGG ATG TGT TCT AAT GGA TCT TTG CAG TGC AGA ATA TGC ATC 714
Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys Ile
225 230 235

TGA 717

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 238 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Gln Ile Pro Glu Phe Asn Lys Leu Glu Lys Arg Met Glu Asn Leu
85 90 95

Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr Tyr Asn Ala
100 105 110

Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp Phe His Asp
115 120 125

Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln Leu Lys Asn
130 135 140

Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr His Lys Cys
145 150 155 160

Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro
165 170 175

Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val Asp Gly Val
180 185 190

Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile Tyr Ser Thr
195 200 205

Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala Ile Ser Phe
210 215 220

Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys Ile
225 230 235



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 681 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..678

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA AGA GGA AGG GGC AGC 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

ACT CTT GGT CTG GAC ATC GAG ACA GCC ACA CGT GCT GGA AAG CAG ATA 192
Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

GTG GAG CGG ATT CTG AAA GAA GAA TCC GAT GAG GCA CTT AAA ATG ACC 240
Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

ATG CAG ATC CCG AAT AAA AAA GTT GAT GAT GGA TTT CTG GAC ATT TGG 288
Met Gln Ile Pro Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp
85 90 95

ACA TAT AAT GCA GAA TTG TTA GTT CTA CTG GAA AAT GAA AGG ACT CTG 336
Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu
100 105 110

GAT TTC CAT GAC TCA AAT GTG AAG AAT CTG TAT GAG AAA GTA AAA AGC 384
Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser
115 120 125

CAA TTA AAG AAT AAT GCC AAA GAA ATC GGA AAT GGA TGT TTT GAG TTC 432
Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe
130 135 140

TAC CAC AAG TGT GAC AAT GAA TGC ATG GAA AGT GTA AGA AAT GGG ACT 480
Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr
145 150 155 160

TAT GAT TAT CCC AAA TAT TCA GAA GAG TCA AAG TTG AAC AGG GAA AAG 528
Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys
165 170 175

GTA GAT GGA GTG AAA TTG GAA TCA ATG GGG ATC TAT CAG ATT CTG GCG 576
Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala
180 185 190

ATC TAC TCA ACT GTC GCC AGT TCA CTG GTG CTT TTG GTC TCC CTG GGG 624
Ile Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly
195 200 205

GCA ATC AGT TTC TGG ATG TGT TCT AAT GGA TCT TTG CAG TGC AGA ATA 672
Ala Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile
210 215 220

TGC ATC TGA 681
Cys Ile
225

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 226 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Gln Ile Pro Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp
85 90 95

Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu
100 105 110

Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser
115 120 125

Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe
130 135 140

Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr
145 150 155 160

Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys
165 170 175

Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala
180 185 190

Ile Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly
195 200 205

Ala Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile
210 215 220

Cys Ile
225

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 158 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Gln Ile Pro Val Glu Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro
85 90 95

Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val Asp Gly Val
100 105 110

Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile Tyr Ser Thr
115 120 125

Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala Ile Ser Phe
130 135 140

Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys Ile
145 150 155



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 163 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Asp Leu Ser Arg Gly Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu
85 90 95

Gly Gly Trp Thr Gly Met Ile Asp Gly Trp Tyr Gly Tyr His His Gln
100 105 110

Asn Glu Gln Gly Ser Gly Tyr Ala Ala Asp Gln Lys Ser Thr Gln Asn
115 120 125

Ala Ile Asn Gly Ile Thr Asn Lys Val Asn Ser Val Ile Glu Lys Met
130 135 140

Asn Ile Gln Phe Thr Ala Val Gly Lys Glu Phe Ser Cys Leu Thr Ala
145 150 155 160

Tyr His Arg



(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 231 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Gln Ile Pro Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg
85 90 95

Met Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp
100 105 110

Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu
115 120 125

Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser
130 135 140

Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe
145 150 155 160

Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr
165 170 175

Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys
180 185 190

Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala
195 200 205

Ile Tyr Ser Thr Val Ala Ser Ser Gly Gly Ser Tyr Ser Met Glu His
210 215 220

Phe Arg Trp Gly Lys Pro Val
225 230



(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 225 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Gln Ile Pro Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg
85 90 95

Met Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp
100 105 110

Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu
115 120 125

Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser
130 135 140

Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe
145 150 155 160

Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr
165 170 175

Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys
180 185 190

Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala
195 200 205

Ile Tyr Ser Thr Val Ala Ser Ser Gly Gly Ser Tyr Ser Met Leu Val
210 215 220

Asn
225



(2) INFORMATION FOR SEQ ID NO:31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 912 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 1..912 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31: 

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA AGA GGA AGG GGC AGC 144
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

ACT CTT GGT CTG GAC ATC GAG ACA GCC ACA CGT GCT GGA AAG CAG ATA 192
Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

GTG GAG CGG ATT CTG AAA GAA GAA TCC GAT GAG GCA CTT AAA ATG ACC 240
Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

ATG CAG ATC CCG GGT CTA TTT GGA GCC ATT GCC GGT TTT ATT GAA GGG 288
Met Gln Ile Pro Gly Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu Gly
85 90 95

GGA TGG ACT GGA ATG ATA GAT GGA TGG TAC GGT TAT CAT CAT CAG AAT 336
Gly Trp Thr Gly Met Ile Asp Gly Trp Tyr Gly Tyr His His Gln Asn
100 105 110

GAA CAG GGA TCA GGC TAT GCA GCG GAT CAA AAA AGC ACA CAA AAT GCC 384
Glu Gln Gly Ser Gly Tyr Ala Ala Asp Gln Lys Ser Thr Gln Asn Ala
115 120 125






ATT AAC GGC ATT ACA AAC AAG GTG AAC TCT GTT ATC GAG AAA ATG AAC 432
Ile Asn Gly Ile Thr Asn Lys Val Asn Ser Val Ile Glu Lys Met Asn
130 135 140

ATT CAA TTC ACA GCT GTG GGT AAA GAA TTC AAC AAA TTA GAA AAA AGG 480
Ile Gln Phe Thr Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg
145 150 155 160

ATG GAA AAT TTA AAT AAA AAA GTT GAT GAT GGA TTT CTG GAC ATT TGG 528
Met Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp
165 170 175

ACA TAT AAT GCA GAA TTG TTA GTT CTA CTG GAA AAT GAA AGG ACT CTG 576
Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu
180 185 190

GAT TTC CAT GAC TCA AAT GTG AAG AAT CTG TAT GAG AAA GTA AAA AGC 624
Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser
195 200 205

CAA TTA AAG AAT AAT GCC AAA GAA ATC GGA AAT GGA TGT TTT GAG TTC 672
Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe
210 215 220

TAC CAC AAG TGT GAC AAT GAA TGC ATG GAA AGT GTA AGA AAT GGG ACT 720
Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr
225 230 235 240

TAT GAT TAT CCC AAA TAT TCA GAA GAG TCA AAG TTG AAC AGG GAA AAG 768
Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys
245 250 255

GTA GAT GGA GTG AAA TTG GAA TCA ATG GGG ATC TAT CAG ATT CTG GCG 816
Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala
260 265 270

ATC TAC TCA ACT GTC GCC AGT TCA CTG GTG CTT TTG GTC TCC CTG GGG 864
Ile Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly
275 280 285

GCA ATC AGT TTC TGG ATG TGT TCT AAT GGA TCT TTG CAG TGC AGA ATA 912
Ala Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile
290 295 300
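
In a CDS entry such as the one above, each line of codons is paired with its three-letter translation, so the two rows can be checked against each other. The following is a minimal sketch of such a check, assuming the standard genetic code; the names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative and are not part of the application.

```python
# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter amino acid codes; "*" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}


def translate(codon_line):
    """Translate a space-separated line of codons into three-letter codes.

    A trailing running base count (e.g. "48") is ignored.
    """
    codons = [tok for tok in codon_line.split() if tok.isalpha()]
    return [THREE_LETTER[CODON_TABLE[codon]] for codon in codons]


# First codon line of SEQ ID NO: 31, as printed in the listing.
first_line = "ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TGC TTT CTT TGG 48"
print(" ".join(translate(first_line)))
```

A line whose computed translation differs from the printed amino acid row points to a transcription error in one of the two rows.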



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 304 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp
1 5 10 15

His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe
20 25 30

Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser
35 40 45

Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile
50 55 60

Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr
65 70 75 80

Met Gln Ile Pro Gly Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu Gly
85 90 95

Gly Trp Thr Gly Met Ile Asp Gly Trp Tyr Gly Tyr His His Gln Asn
100 105 110

Glu Gln Gly Ser Gly Tyr Ala Ala Asp Gln Lys Ser Thr Gln Asn Ala
115 120 125

Ile Asn Gly Ile Thr Asn Lys Val Asn Ser Val Ile Glu Lys Met Asn
130 135 140

Ile Gln Phe Thr Ala Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg
145 150 155 160

Met Glu Asn Leu Asn Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp
165 170 175

Thr Tyr Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu
180 185 190

Asp Phe His Asp Ser Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser
195 200 205

Gln Leu Lys Asn Asn Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe
210 215 220

Tyr His Lys Cys Asp Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr
225 230 235 240

Tyr Asp Tyr Pro Lys Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys
245 250 255

Val Asp Gly Val Lys Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala
260 265 270

Ile Tyr Ser Thr Val Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly
275 280 285

Ala Ile Ser Phe Trp Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile
290 295 300



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 474 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS 

(B) LOCATION: 1..471 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

GTG GGT AAA GAA TTC AAC AAA TTA GAA AAA AGG ATG GAA AAT TTA AAT 48
Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg Met Glu Asn Leu Asn
1 5 10 15

AAA AAA GTT GAT GAT GGA TTT CTG GAC ATT TGG ACA TAT AAT GCA GAA 96
Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr Tyr Asn Ala Glu
20 25 30

TTG TTA GTT CTA CTG GAA AAT GAA AGG ACT CTG GAT TTC CAT GAC TCA 144
Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp Phe His Asp Ser
35 40 45

AAT GTG AAG AAT CTG TAT GAG AAA GTA AAA AGC CAA TTA AAG AAT AAT 192
Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln Leu Lys Asn Asn
50 55 60

GCC AAA GAA ATC GGA AAT GGA TGT TTT GAG TTC TAC CAC AAG TGT GAC 240
Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr His Lys Cys Asp
65 70 75 80

AAT GAA TGC ATG GAA AGT GTA AGA AAT GGG ACT TAT GAT TAT CCC AAA 288
Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro Lys
85 90 95

TAT TCA GAA GAG TCA AAG TTG AAC AGG GAA AAG GTA GAT GGA GTG AAA 336
Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val Asp Gly Val Lys
100 105 110

TTG GAA TCA ATG GGG ATC TAT CAG ATT CTG GCG ATC TAC TCA ACT GTC 384
Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile Tyr Ser Thr Val
115 120 125

GCC AGT TCA CTG GTG CTT TTG GTC TCC CTG GGG GCA ATC AGT TTC TGG 432
Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala Ile Ser Phe Trp
130 135 140

ATG TGT TCT AAT GGA TCT TTG CAG TGC AGA ATA TGC ATC TGA 474
Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys Ile
145 150 155



(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 157 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Val Gly Lys Glu Phe Asn Lys Leu Glu Lys Arg Met Glu Asn Leu Asn
1 5 10 15

Lys Lys Val Asp Asp Gly Phe Leu Asp Ile Trp Thr Tyr Asn Ala Glu
20 25 30

Leu Leu Val Leu Leu Glu Asn Glu Arg Thr Leu Asp Phe His Asp Ser
35 40 45

Asn Val Lys Asn Leu Tyr Glu Lys Val Lys Ser Gln Leu Lys Asn Asn
50 55 60

Ala Lys Glu Ile Gly Asn Gly Cys Phe Glu Phe Tyr His Lys Cys Asp
65 70 75 80

Asn Glu Cys Met Glu Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro Lys
85 90 95

Tyr Ser Glu Glu Ser Lys Leu Asn Arg Glu Lys Val Asp Gly Val Lys
100 105 110

Leu Glu Ser Met Gly Ile Tyr Gln Ile Leu Ala Ile Tyr Ser Thr Val
115 120 125

Ala Ser Ser Leu Val Leu Leu Val Ser Leu Gly Ala Ile Ser Phe Trp
130 135 140

Met Cys Ser Asn Gly Ser Leu Gln Cys Arg Ile Cys Ile
145 150 155



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35: 
CATGGATCAT ATGTTAACAG ATATCAAGGC CTGACTGACT GAGACCT 47 
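
Each oligonucleotide entry states its length under item (A) and, in some entries, repeats it as a running count at the end of the sequence line. The sketch below shows one way to confirm that the printed bases add up to the declared length; the function name `sequence_length` is illustrative and is not part of the application.

```python
def sequence_length(listing_line):
    """Count the bases on a listing line printed in space-separated groups.

    Purely numeric tokens (the trailing running count) are skipped.
    """
    groups = [tok for tok in listing_line.split() if tok.isalpha()]
    return sum(len(group) for group in groups)


# SEQ ID NO: 35 as printed; its header declares a length of 47 base pairs.
seq_35 = "CATGGATCAT ATGTTAACAG ATATCAAGGC CTGACTGACT GAGACCT 47"
print(sequence_length(seq_35) == 47)
```

The same check can be applied to any other entry by substituting its sequence line and declared length.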



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
CTAGTATACA ATTGTCTATA GTTCCGGACT GACTGACTC 39

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: 
CATGGGCGCC CATATGGGCA TATTCGGCG 



(2) INFORMATION FOR SEQ ID NO:38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
CCGCCGGTAT ACCCGTATAA CCC



(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
CATGCATCAT ATGTTAACAA GTACTCGATA TCAATGAGTG ACTGAAGCT 



(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 41 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
CTAGTATACA ATTGTTCATG AGCTATACTT ACTCACTGAC T

(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: 
AATTCGTACC TA 



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: 
GCATGGATCT AG

WHAT IS CLAIMED IS:

1. A vaccine for stimulating protection in
animals against infection by influenza virus which
comprises an effective amount of an immunogenic
fragment of the HA2 subunit of an HA protein selected
from the group consisting of a type A subtype influenza
virus or a type B influenza virus.

2. The vaccine according to claim 1 wherein
said type A subtype is H3N2.

3. The vaccine according to claim 1 wherein 
the polypeptide is fused to a second polypeptide. 

4. The vaccine according to claim 2 wherein 
the second polypeptide comprises the N terminal amino 
acids of a NS1 protein. 

5. The vaccine according to claim 1 wherein
the immunogenic fragment of the HA2 subunit is selected
from the group consisting of a peptide comprising amino
acids 1 to 221 of the H3HA2 subtype, a peptide comprising
amino acids 77 to 221 of the H3HA2 subtype, a peptide
comprising amino acids 1 to 223 of the BHA2 type, and a
peptide comprising amino acids 41 to 223 of the BHA2
type.

6. The vaccine according to claim 5
comprising NS1(1-81)H3HA2(1-221) SEQ ID NO: 10.

7. The vaccine according to claim 5
comprising NS1(1-81)H3HA2(77-221) SEQ ID NO: 12.

8. The vaccine according to claim 5
comprising NS1(1-42)BLHA2(41-223) SEQ ID NO: 14.

9. A protein comprising an immunogenic
fragment of the HA2 subunit of an HA protein selected 
from the group consisting of Type A subtype or type B 
influenza virus. 

10. The protein according to claim 9 wherein 
said type A subtype is H3N2. 

11. The protein according to claim 9 wherein 
the peptide containing the immunogenic fragment is fused 
to a second peptide or protein. 

12. The protein according to claim 10 wherein 
the second peptide comprises the N terminal amino acids 
of a NS1 protein.

13. The protein according to claim 10 wherein
the immunogenic fragment of the HA2 subunit is selected
from the group consisting of a peptide comprising amino
acids 1 to 221 of the H3HA2 subunit, a peptide comprising
amino acids 77 to 221 of the H3HA2 subunit, a peptide
comprising amino acids 1 to 223 of the BHA2 subunit, and a
peptide comprising amino acids 41 to 223 of the BHA2
subunit.

14. A polypeptide NS1(1-81)H3HA2(1-221) SEQ ID NO: 10.

15. A polypeptide NS1(1-81)H3HA2(77-221) SEQ ID NO: 12.

16. A polypeptide NS1(1-42)BLHA2(41-223) SEQ ID NO: 14.

17. A DNA molecule comprising a coding 
sequence for an immunogenic fragment of the HA2 subunit 
of an HA protein selected from the group consisting of a 
Type A subtype or type B influenza virus. 

18. The DNA molecule according to claim 17
wherein said Type A subtype is H3N2.

19. The DNA molecule according to claim 17
comprising a coding sequence for the polypeptide
NS1(1-81)H3HA2(1-221) SEQ ID NO: 10.

20. The DNA molecule according to claim 17
comprising a coding sequence for the polypeptide
NS1(1-42)H3BLHA2(41-223) SEQ ID NO: 14.

21. The DNA molecule according to claim 17
comprising a coding sequence for the polypeptide
NS1(1-81)H3HA2(77-221) SEQ ID NO: 12.

22. Plasmid pMG13H3HA SEQ ID NO: 9. 

23. Plasmid pNS1(1-42)BLHA2(41-223) SEQ ID NO: 13.

24. A microorganism transformed with a DNA 
molecule comprising a coding sequence for an immunogenic 
fragment of the HA2 subunit of an HA protein selected 
from the group consisting of a Type A subtype or type B 
influenza virus. 

25. The microorganism according to claim 24
wherein said Type A subtype is H3N2.

26. The microorganism according to claim 24
wherein said DNA molecule comprises a coding sequence for
the polypeptide NS1(1-81)H3HA2(1-221) SEQ ID NO: 10.

27. A combination vaccine for stimulating 
protection in animals against infection by influenza 
virus which comprises a first polypeptide having an 
immunogenic fragment of the HA2 subunit of an influenza 
H3 subtype virus and a second polypeptide selected from 
the group consisting of a polypeptide having an 
immunogenic fragment of the HA2 subunit of a type B 
influenza virus, and a polypeptide having an immunogenic 
fragment of the HA2 subunit of an H1 subtype influenza
virus, and a polypeptide having an immunogenic fragment 
of the HA2 subunit of an H2 subtype influenza virus. 

28. The combination vaccine according to claim
27 wherein the first polypeptide is selected from the
group consisting of NS1(1-81)H3HA2(1-221) SEQ ID NO: 10 and
NS1(1-81)H3HA2(77-221) SEQ ID NO: 12.

29. The combination vaccine according to claim 
27 wherein the second polypeptide is a polypeptide having 
an immunogenic fragment of the HA2 subunit of an H1 
subtype influenza virus. 

30. The combination vaccine according to claim 
27 wherein said second polypeptide is selected from the 
group consisting of C13 SEQ ID NO: 16, D SEQ ID NO: 18, 
C13 short SEQ ID NO: 20, D short SEQ ID NO: 22, A SEQ ID 
NO: 24, C SEQ ID NO: 26, AD SEQ ID NO: 27, A13 SEQ ID NO: 
28, M SEQ ID NO: 29, AM SEQ ID NO: 30, AM+ SEQ ID NO: 32, 
and H1HA2(46-222) SEQ ID NO: 34. 

31. The combination vaccine according to claim 
27 wherein said second polypeptide is 
NS1(1-42)BLHA2(41-223) SEQ ID NO: 14. 

32. A combination vaccine for stimulating 
protection in animals against infection by influenza 
virus which comprises a first polypeptide having an 
immunogenic fragment of the HA2 subunit of an influenza 
H3 subtype virus, a second polypeptide having an 
immunogenic fragment of the HA2 subunit of an influenza B 
type virus, and a third polypeptide selected from the 
group consisting of a polypeptide having an immunogenic 
fragment of the HA2 subunit of an H1 subtype influenza 
virus and a polypeptide having an immunogenic fragment of 
the HA2 subunit of an H2 subtype influenza virus. 






FIGURE 1 

(a) 

(c) — tc 1 a— c—t— c t~t ggg~a act 

(d) GGCATATTCC GCCCAATACC AGGTTTCATA GAAAATGGTT GGGACCGAAT 50 



(a) 
(b) 



(c) t~a atcat g gaac at ct 

(d) GATAGACGGT TGGTACGCTT TCAGGCATCA AAATTC-GAG CCCACACGAC 100 



(c) -t g aa a aat ta — gg g — t-caaac 

(d) AAGCAGCAGA TCTTAAAAGC ACTCAAGCAG CCATCGACCA AATCAATGGC 150 

(a) 
(b) 

(c) — gg — — ct ct — t a-t attc a cagctg-g-g 

(d) AAACTGAATA GGGTAATCGA GAAGACGAAC GAGAAATTCC ATCAAATCCA 200 

(c) t — a aaca — t aaa — 9 — gg-aa-tt-a a-t a-a- 

(d) AAAGGAATTC TCAGAAGTAG AAGCGAGAAT TCAGGACCTC GAGAAATACG 250 

(a) 
(b) 

(c) — t — tgg atttc-g — c a-t a-a 1 a — at-gt-a — t 

(d) TTGAAGACAC TAAAATAGAT CTCTGGTCTT ACAATGCGGA GCTTCTTGTC 300 



(a) 
(b) 

(c) cta— a tg— agg — tc-g— t-c ca— — — aa -tg g — 

(d) GCTCTGGAGA ACCAACATAC AATTCATCTG ACTGACTCGG AAATGAACAA 350 

(a) 
(b) 

[c) t a g gt— aa- -c t-a-a -a-t c a-a— a— c- 

(d) ACTGTTTGAA AAAACAAGGA GGCAACTGAG GGAAAATGCT GAGGACATGG 400 

(a) 
(b) 

(d) GCAATGGTTG CTTCAAAATA TACCACAAAT GTGACAATGC TTGCATAGGG 450 

(a) 

(b) 

(c) agtg-a ~ tt — ccc aa ttc a — gt — aa 

(d) TCAATCAGAA ATGGGACTTA TGACCATGAT CTATACAGAG ACGAAGCATT 500 





(a) 
(b) 

(c) gttg a— gaaa— g-ag -t— a— ga- -t— g-a atgggg-tct 

(d) AAACAACCGG TTTCAGATCA AAGGTGTTCA ACTGAAGTCA GGATACAAAG 550 

(a) 
(b) 

(c) -tca 1 gc c-a caa-tg-cg -ca-t-cac g-gct-t-g 

(d) ACTGGATCCT GTGCATTTCC TTTCCCATAT CATGCTTTTT GCTTTGTGTT 600 

FIGURE 1 (con't) 

(c) [illegible] 

(d) [illegible] 650 

(c) [illegible] 

(d) GTCCAACATT TGCATTTGA- 670 

FIGURE 2 

[Nucleotide sequence with the deduced amino acid sequence beneath it. This sheet is largely illegible in the source; only the fragments below could be read.] 

[illegible] 288 
[illegible] 
85 90 95 

[illegible] TAC [illegible] 
[illegible] Glu Gly Met Ile Asp Gly Trp Tyr Gly Phe Arg His Gln 
105 110 

[illegible] 
[illegible] 
135 140 

[illegible] TCA GAA [illegible] 
[illegible] Glu Lys Glu Phe Ser Glu Val Glu Gly 
145 155 160 

[illegible] 
[illegible] Val Glu Asp Thr Lys Ile Asp Leu 
165 170 175 

FIGURE 2 (con't) 

[illegible] TGC ATA GGG TCA ATC AGA AAT GGG 720 
Ile Tyr His Lys Cys Asp Asn Ala Cys Ile Gly Ser Ile Arg Asn Gly 
230 235 240 

[illegible] AGA GAC GAA GCA TTA [illegible] 768 
Thr Tyr Asp His Asp Val Tyr Arg Asp Glu Ala Leu Asn Asn Arg Phe 
245 250 255 

[illegible] GTT GAA CTG AAG TCA GGA TAC AAA GAC TGG ATC CTG 816 
Gln Ile Lys Gly Val Glu Leu Lys Ser Gly Tyr Lys Asp Trp Ile Leu 
260 265 270 

[illegible] TGT GTT GTT TTG CTG 864 
Trp Ile Ser Phe Ala Ile Ser Cys Phe Leu Leu Cys Val Val Leu Leu 
275 280 285 

[illegible] ATT AGG TGC AAC ATT 912 
Gly Phe Ile Met Trp Ala Cys Gln Lys Gly Asn Ile Arg Cys Asn Ile 
290 295 300 

TGC ATT 918 
Cys Ile 
305 

FIGURE 3 

ATG [illegible] ACT GTG TCA AGC [illegible] TGC TTT CTT TGG 48 
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Cys Phe Leu Trp 

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96 
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe 
20 25 30 

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC CTA AGA GGA AGG GGC AGC 144 
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Leu Arg Gly Arg Gly Ser 
35 40 45 

[illegible] GAC ATC GAG ACA GCC ACA CGT GCT [illegible] AAG CAG ATA 192 
Thr Leu Gly Leu Asp Ile Glu Thr Ala Thr Arg Ala Gly Lys Gln Ile 
50 55 60 

[illegible] GAG [illegible] CTG [illegible] GAA GAA TCC GAT [illegible] GCA CTT AAA ATG ACC 240 
Val Glu Arg Ile Leu Lys Glu Glu Ser Asp Glu Ala Leu Lys Met Thr 
65 70 75 80 

ATG [illegible] CAT ATG [illegible] ATT CAG GAC CTC GAG AAA TAC GTT GAA GAC ACT 288 
Met Asp His Met Leu Ile Gln Asp Leu Glu Lys Tyr Val Glu Asp Thr 
85 90 95 

AAA ATA GAT CTC TGG TCT TAC AAT GCG GAG CTT CTT GTC GCT CTG GAG 336 
Lys Ile Asp Leu Trp Ser Tyr Asn Ala Glu Leu Leu Val Ala Leu Glu 
100 105 110 

[illegible] GAT CTG ACT GAC TCG GAA ATG AAC AAA CTG TTT 384 
Asn Gln His Thr Ile Asp Leu Thr Asp Ser Glu Met Asn Lys Leu Phe 
115 120 125 

[illegible] AGG [illegible] CTG AGG GAA [illegible] GCT GAG GAC ATG GGC AAT 432 
Glu Lys Thr Arg Arg Gln Leu Arg Glu Asn Ala Glu Asp Met Gly Asn 

[illegible] ATA TAC CAC [illegible] TGT GAC AAT GCT TGC ATA GGG TCA 480 
Gly Cys Phe Lys Ile Tyr His Lys Cys Asp Asn Ala Cys Ile Gly Ser 
145 150 155 160 

[illegible] GAC CAT GAT [illegible] TAC AGA GAC GAA GCA TTA 528 
Ile Arg Asn Gly Thr Tyr Asp His Asp Val Tyr Arg Asp Glu Ala Leu 
165 170 175 

[illegible] ATC [illegible] GGT GTT GAA CTG AAG TCA GGA TAC AAA 576 
Asn Asn Arg Phe Gln Ile Lys Gly Val Glu Leu Lys Ser Gly Tyr Lys 
180 185 190 

GAC TGG ATC CTG TGG ATT TCC TTT GCC ATA TCA TGC TTT TTG CTT TGT 624 
Asp Trp Ile Leu Trp Ile Ser Phe Ala Ile Ser Cys Phe Leu Leu Cys 
195 200 205 

GTT GTT TTG CTG [illegible] TTC ATC ATG TGG GCC TGC CAA AAA GGC AAC ATT 672 
Val Val Leu Leu Gly Phe Ile Met Trp Ala Cys Gln Lys Gly Asn Ile 
210 215 220 

AGG TGC AAC ATT TGC ATT 
Arg Cys Asn Ile Cys Ile 
225 230 

FIGURE 4 

ATG GAT CCA AAC ACT GTG TCA AGC TTT CAG GTA GAT TCC TTT CTT TGG 
Met Asp Pro Asn Thr Val Ser Ser Phe Gln Val Asp Ser Phe Leu Trp 
1 5 10 15 

CAT GTC CGC AAA CGA GTT GCA GAC CAA GAA CTA GGT GAT GCC CCA TTC 96 
His Val Arg Lys Arg Val Ala Asp Gln Glu Leu Gly Asp Ala Pro Phe 
20 25 30 

CTT GAT CGG CTT CGC CGA GAT CAG AAA TCC ATG CAT GGA TCA TAT GTT 144 
Leu Asp Arg Leu Arg Arg Asp Gln Lys Ser Met His Gly Ser Tyr Val 
35 40 45 

AAC AAG ACA CAA GAA GCT ATA AAC AAG ATA ACA AAA AAT CTC AAC TAT 192 
Asn Lys Thr Gln Glu Ala Ile Asn Lys Ile Thr Lys Asn Leu Asn Tyr 
50 55 60 

TTA AGT GAG CTA GAA GTA AAA AAC CTT CAA AGA CTA AGC GGA GCA ATG 240 
Leu Ser Glu Leu Glu Val Lys Asn Leu Gln Arg Leu Ser Gly Ala Met 
65 70 75 80 

AAT GAG CTT CAC GAC GAA ATA CTC GAG CTA GAC GAA AAA GTG GAT GAT 288 
Asn Glu Leu His Asp Glu Ile Leu Glu Leu Asp Glu Lys Val Asp Asp 
85 90 95 

CTA AGA GCT GAT ACA ATA AGC TCA CAA ATA GAG CTT GCA GTC TTG CTT 336 
Leu Arg Ala Asp Thr Ile Ser Ser Gln Ile Glu Leu Ala Val Leu Leu 
100 105 110 

TCC AAC GAA GGG ATA ATA AAC AGT GAA GAT GAG CAT CTC TTG GCA CTT 384 
Ser Asn Glu Gly Ile Ile Asn Ser Glu Asp Glu His Leu Leu Ala Leu 
115 120 125 

GAA AGA AAA CTG AAG AAA ATG CTT GGC CCC TCT GCT GTA GAA ATA GGG 432 
Glu Arg Lys Leu Lys Lys Met Leu Gly Pro Ser Ala Val Glu Ile Gly 
130 135 140 

AAT GGG TGC TTT GAA ACC AAA CAC AAA TGC AAC CAG ACT TGC CTA GAC 480 
Asn Gly Cys Phe Glu Thr Lys His Lys Cys Asn Gln Thr Cys Leu Asp 
145 150 155 160 

AGG ATA GCT GCT GGC ACC TTT AAT GCA GGA GAT TTT TCT CTT CCC ACT 528 
Arg Ile Ala Ala Gly Thr Phe Asn Ala Gly Asp Phe Ser Leu Pro Thr 
165 170 175 

TTT GAT TCA TTA AAC ATT ACT GCT GCA TCT TTA AAT GAT GAT GGC TTG 576 
Phe Asp Ser Leu Asn Ile Thr Ala Ala Ser Leu Asn Asp Asp Gly Leu 
180 185 190 

FIGURE 4 (con't) 

[illegible] 624 
[illegible] Ser Ser Leu 
205 

[illegible] 
[illegible] 
215 220 

GAC AAT GTT TCT TGT TCC ATC TGT CTC 699 
Asp Asn Val Ser Cys Ser Ile Cys Leu 
225 230 